This article describes a method for converting incorrectly formatted numeric strings in CSV files into doubles by replacing commas with dots and handling various input errors, complemented by a Java util class implementation and tests.
Introduction
Recently, I had to solve a seemingly simple problem. The technical documentation for the input format was clear: a particular column in the CSV document held numbers in double format. However, users were allowed to open the CSV files and edit them manually. This led to situations where the user, or the program used for editing, changed the number to another standard format or even to an incorrect one. In most cases, the dot was changed to a comma.
Thus, I needed to devise a simple way to fix the incorrect input where possible and parse the string to a double.
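To make the problem concrete, the short sketch below shows what happens when such a value reaches the standard parser unchanged. The class name CommaProblemDemo and the helper canParse are illustrative names only and are not part of the original code.

```java
// Illustrative sketch: class and method names are assumptions made for this example.
final class CommaProblemDemo {

    // Returns true if Double.parseDouble accepts the text exactly as given.
    static boolean canParse(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canParse("123.45")); // true: dot as decimal separator
        System.out.println(canParse("123,45")); // false: comma is rejected
    }
}
```

Double.parseDouble does not take the locale into account, so a value with a decimal comma is rejected no matter which machine the code runs on.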
Implementation
I have created a simple util class with a single static method to fix the incorrect string format.
The method handles the following cases:
- It throws an IllegalArgumentException if the input string is null or empty.
- It throws an IllegalArgumentException if the input string contains more than one separator or a mix of different separators.
- It throws a NumberFormatException if the input string is not a number.
- Otherwise, it replaces the comma with a dot and parses the string to a double.
The NumberFormatException comes from Double.parseDouble(). The parseStringToDouble() method does not handle this failure; it is up to the caller to catch the exception and manage it. Feel free to adjust the method according to your needs.
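The separator check in the second case boils down to counting how often a character occurs in a string. One common way to do this without a loop is to compare the string's length before and after removing that character. The following is a minimal sketch; the class name SeparatorCount and the method count are hypothetical names used only for this illustration.

```java
// Illustrative sketch: SeparatorCount and count are hypothetical names.
final class SeparatorCount {

    // Counts occurrences of a single-character separator by measuring
    // how much shorter the text becomes once the separator is removed.
    static int count(String text, String separator) {
        return text.length() - text.replace(separator, "").length();
    }

    public static void main(String[] args) {
        System.out.println(count("123.45", "."));  // 1
        System.out.println(count("1,234,5", ",")); // 2
        System.out.println(count("123.45", ","));  // 0
    }
}
```

The separator has to be replaced with an empty string here. Replacing a comma with a dot leaves the length unchanged, so a length comparison after that replacement would always report zero.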
Here is the implementation:
public final class StringUtil {

    private static final String DOT = ".";
    private static final String COMMA = ",";
    private static final String EMPTY = "";

    /**
     * Parses a string to a double, accepting either a dot or a comma as the decimal separator.
     *
     * @param numberStr the string to parse
     * @return the parsed double
     * @throws IllegalArgumentException if the input string is null or empty, or if the number is ambiguous
     * @throws NumberFormatException if the input string is not a number
     */
    public static Double parseStringToDouble(final String numberStr) {
        if (numberStr == null || numberStr.isEmpty()) {
            throw new IllegalArgumentException("Input string is null or empty");
        }
        // Count each separator by checking how much shorter the string gets without it.
        // Replacing a comma with a dot would keep the length the same, so both counts
        // remove the separator entirely.
        final int dotCount = numberStr.length() - numberStr.replace(DOT, EMPTY).length();
        final int commaCount = numberStr.length() - numberStr.replace(COMMA, EMPTY).length();
        if (dotCount + commaCount > 1) {
            throw new IllegalArgumentException("There is more than one separator, or a mix of them, the number is ambiguous");
        }
        // At most one separator is left; normalize it to a dot before parsing.
        return Double.parseDouble(numberStr.replace(COMMA, DOT));
    }
}
Feel free to use this simple algorithm in your projects and extend it as you see fit. You may not need to throw an exception if the input string is null or empty or if the number is ambiguous. You may want to catch the NumberFormatException from the parseDouble method and return 0.0 in that case, or you could return an empty Optional. It is up to you.
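As one possible take on the "empty Optional" idea, here is a minimal sketch of a variant that never throws. It is not part of the original util class: the names LenientParser and tryParse are assumptions, and OptionalDouble is used here instead of Optional<Double> only to avoid boxing.

```java
import java.util.OptionalDouble;

// Illustrative sketch: LenientParser and tryParse are hypothetical names.
final class LenientParser {

    // Returns the parsed value, or an empty OptionalDouble when the input is
    // null, empty, ambiguous (more than one separator), or not a number.
    static OptionalDouble tryParse(String text) {
        if (text == null || text.isEmpty()) {
            return OptionalDouble.empty();
        }
        int dots = text.length() - text.replace(".", "").length();
        int commas = text.length() - text.replace(",", "").length();
        if (dots + commas > 1) {
            return OptionalDouble.empty();
        }
        try {
            return OptionalDouble.of(Double.parseDouble(text.replace(",", ".")));
        } catch (NumberFormatException e) {
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("123,45").orElse(0.0));    // 123.45
        System.out.println(tryParse("123,45.67").isPresent()); // false
    }
}
```

With this shape the caller decides on the fallback, for example with orElse(0.0), instead of the parser choosing a default value silently.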
Tests
I also wrote a few tests to verify that the algorithm works correctly. Please review them and adjust them to your needs. Here are the tests:
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
import lombok.val;

class StringUtilTest {

    @Test
    public void testParseValidStringWithDot() {
        val result = StringUtil.parseStringToDouble("123.45");
        Assertions.assertEquals(123.45, result, 0.000);
    }

    @Test
    public void testParseValidStringWithComma() {
        val result = StringUtil.parseStringToDouble("123,45");
        Assertions.assertEquals(123.45, result, 0.000);
    }

    @Test
    public void testParseStringWithMixedSeparatorsShouldThrowException() {
        Assertions.assertThrows(IllegalArgumentException.class,
                () -> StringUtil.parseStringToDouble("123,45.67"));
    }

    @Test
    public void testParseStringWithMultipleDotsShouldThrowException() {
        Assertions.assertThrows(IllegalArgumentException.class,
                () -> StringUtil.parseStringToDouble("123..45"));
    }

    @Test
    public void testParseStringWithMultipleCommasShouldThrowException() {
        Assertions.assertThrows(IllegalArgumentException.class,
                () -> StringUtil.parseStringToDouble("123,,45"));
    }

    @Test
    public void testParseNonNumericStringShouldThrowException() {
        Assertions.assertThrows(NumberFormatException.class,
                () -> StringUtil.parseStringToDouble("not a number"));
    }

    @Test
    public void testParseEmptyStringShouldThrowException() {
        Assertions.assertThrows(IllegalArgumentException.class,
                () -> StringUtil.parseStringToDouble(""));
    }

    @Test
    public void testParseNullShouldThrowException() {
        Assertions.assertThrows(IllegalArgumentException.class,
                () -> StringUtil.parseStringToDouble(null));
    }
}
Conclusion
With this simple algorithm, we have seen how to handle several cases of incorrect string input and how to parse a string to a double.
But in the end, it is up to you how to handle the cases where the input string is null or empty, or where the number is ambiguous. You can throw a custom exception, return 0.0, return an empty Optional, or do something else.
Have you found an easy way to parse strings to doubles? Do you have your own trick, or know another way to convert an arbitrary string to a double? Let us know in the comments below the article. We would like to hear your ideas and stories.