changing float to string precision #137

Open
wants to merge 1 commit into
base: dev
@MichalBrzozowski91 MichalBrzozowski91 commented Dec 28, 2020

The call fmt.Sprintf("%f", some_float_number) converts a float to a string with a default precision of 6:
"The default precision for %e, %f and %#g is 6; for %g it is the smallest number of digits necessary to identify the value uniquely."
(source).
Changing the format from "%f" to "%g" uses exactly as many digits as are needed to identify the value uniquely, so floating point numbers can be saved as strings without loss. According to the same documentation, %g is also the default format (%v) for floating point values.
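The difference between the two verbs can be seen with the standard library alone. The following is a minimal sketch, not the library's actual code: the helper names formatFixed and formatShortest are hypothetical and only stand in for the current and the proposed formatting.

```go
package main

import "fmt"

// formatFixed is a hypothetical stand-in for the current behaviour:
// "%f" always prints 6 digits after the decimal point.
func formatFixed(v float64) string {
	return fmt.Sprintf("%f", v)
}

// formatShortest is a hypothetical stand-in for the proposed behaviour:
// "%g" prints the fewest digits that identify the value uniquely.
func formatShortest(v float64) string {
	return fmt.Sprintf("%g", v)
}

func main() {
	v := 0.00051124743
	fmt.Println(formatFixed(v))    // 0.000511
	fmt.Println(formatShortest(v)) // 0.00051124743
}
```

Note that %g may switch to exponent notation for large or small magnitudes (for example, 1000000.0 is printed as 1e+06), so the textual form of some values changes even though no precision is lost.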

The current precision causes data loss even when a csv file is only read and written back unchanged. Minimal working example:

Running a program:

package main

import (
	"os"

	"github.com/go-gota/gota/dataframe"
)

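func main() {
	// Read the csv file into a dataframe (errors ignored for brevity).
	csvfile, _ := os.Open("dataset.csv")
	defer csvfile.Close()
	df := dataframe.ReadCSV(csvfile)

	// Write the dataframe straight back out without modifying it.
	f, _ := os.Create("output.csv")
	defer f.Close()
	df.WriteCSV(f)
}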

with a csv file:

index,value
0,0.00051124743

produces the output csv file:

index,value
0,0.000511

The suggested change fixes this issue.
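The loss can also be checked without gota by formatting a value, parsing the text back, and comparing the result with the original. This is only a sketch under that assumption: roundTrip is a hypothetical helper and does not exist in the library.

```go
package main

import (
	"fmt"
	"strconv"
)

// roundTrip is a hypothetical helper: it formats v with the given fmt verb,
// parses the text back into a float64, and reports the text together with
// whether the parsed value is exactly equal to v.
func roundTrip(verb string, v float64) (string, bool) {
	text := fmt.Sprintf(verb, v)
	parsed, err := strconv.ParseFloat(text, 64)
	if err != nil {
		return text, false
	}
	return text, parsed == v
}

func main() {
	v := 0.00051124743
	for _, verb := range []string{"%f", "%g"} {
		text, ok := roundTrip(verb, v)
		fmt.Printf("%s -> %s lossless=%t\n", verb, text, ok)
	}
}
```

With "%f" the parsed value differs from the original, while with "%g" it is identical, which matches the behaviour seen in the csv example.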

Moreover, the same float-to-string conversion is used in the function Rapply. As a result, applying an identity function to a dataframe changes its values. Minimal working example:

Running a program:

package main

import (
	"fmt"
	"os"

	"github.com/go-gota/gota/dataframe"
	"github.com/go-gota/gota/series"
)

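func main() {
	// Read the csv file into a dataframe (errors ignored for brevity).
	csvfile, _ := os.Open("dataset.csv")
	defer csvfile.Close()
	df := dataframe.ReadCSV(csvfile)

	// Apply an identity function; the data should come back unchanged.
	g := func(s series.Series) series.Series { return s }
	dfApplied := df.Rapply(g)

	// Print the float stored at row 0, column 1.
	fmt.Println(dfApplied.Elem(0, 1).Float())
}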

with a csv file:

name,value
a,0.00051124743

prints the value truncated to 6 decimal places:

0.000511

The suggested change fixes this issue as well.

kalorz added a commit to elpassion/gota that referenced this pull request Mar 3, 2021
@chrmang chrmang changed the base branch from master to dev April 22, 2021 19:11