Java - splitting text into array without obvious delimiter

I need to split each line of text into an array using a loop. The problem is that there is no obvious delimiter used when formatting the text file (which I cannot change):

Adam Rippon      New York, NY    77.58144.6163.6780.94
Brandon Mroz     Broadmoor, CO   70.57138.1266.8471.28
Stephen Carriere Boston, MA      64.42138.8368.2770.56
Grant Hochstein  New York, NY    64.62133.8867.4468.44
Keegan Messing   Alaska, AK      61.15136.3071.0266.28
Timothy Dolensky Atlanta, AL     61.76123.0861.3063.78
Max Aaron        Broadmoor, CO   86.95173.4979.4893.51
Jeremy Abbott    Detroit, MI     99.86174.4193.4280.99
Jason Brown      Skokie Value,IL 87.47182.6193.3489.27
Joshua Farris    Broadmoor, CO   78.37169.6987.1783.52
Richard Dornbush All Year, CA    92.04144.3465.8278.52
Douglas Razzano  Coyotes, AZ     75.18157.2580.6976.56
Ross Miner       Boston, MA      71.94152.8772.5380.34
Sean Rabbit      Glacier, CA     60.58122.7656.9066.86
Lukas Kaugars    Broadmoor, CO   64.57114.7550.4766.28
Philip Warren    All Year, CA    55.80113.2457.0258.22
Daniel Raad      Southwest FL    52.98108.0358.6151.42
Scott Dyer       Brooklyn, OH    55.78100.9744.3357.64
Robert PrzepioskiRochester, NY   47.00100.3449.2651.08

      

Ideally, I would like each name to be in [0] (or the first name in [0] and the last name in [1]), each location in [2] (or likewise in two separate indexes for city and state), and each score in its own index. There are four separate scores for each person. For example, Adam Rippon's scores are 77.58, 144.61, 63.67 and 80.94.

I cannot split on spaces because some cities have a space in their name (for example, New York would be split into New and York in two different array elements, while Broadmoor would stay in a single element). I cannot split the cities on commas either, because Southwest FL does not have a comma. I also cannot split the numbers on the decimal point, because the resulting numbers would be wrong. So is there an easy way to do this? Perhaps a way to split the numbers by the number of decimal places?

+3




6 answers


It looks like there is a fixed size for each column. So in your case, column 1 is 17 characters long, the second is 16 characters and the last is 21 characters long.

Now you can just iterate over the lines and use the substring() method. Something like:

String firstColumn = line.substring(0, 17).trim();
String secondColumn = line.substring(17, 33).trim();
String thirdColumn = line.substring(33, line.length()).trim();

      

To extract numbers, we could use a regular expression that searches for all numbers with two decimal places.



Pattern pattern = Pattern.compile("(\\d+\\.[0-9]{2})");

Matcher matcher = pattern.matcher(thirdColumn);

while(matcher.find())
{
    System.out.println(matcher.group());
}

      

So in this case, the input 47.00100.3449.2651.08 will output:

47.00
100.34
49.26
51.08
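
Putting the two steps together, a minimal sketch might look like the following. The class name FixedWidthLine and the method parse are made up for illustration, and the offsets 17 and 33 are read off the sample data, so they would need adjusting for a file with different column widths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedWidthLine {

    // Matches one score: digits, a dot, then exactly two decimals.
    private static final Pattern SCORE = Pattern.compile("\\d+\\.\\d{2}");

    // Returns { name, location, score1, score2, ... } for one line.
    // Assumes the name column is 17 characters wide and the location column 16.
    public static String[] parse(String line) {
        List<String> fields = new ArrayList<>();
        fields.add(line.substring(0, 17).trim());
        fields.add(line.substring(17, 33).trim());

        // Everything after the two text columns is the run of scores.
        Matcher matcher = SCORE.matcher(line.substring(33));
        while (matcher.find()) {
            fields.add(matcher.group());
        }
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String line = "Robert PrzepioskiRochester, NY   47.00100.3449.2651.08";
        for (String field : parse(line)) {
            System.out.println(field);
        }
    }
}
```

For the line used in main, this prints the name, the location, and then the four scores 47.00, 100.34, 49.26 and 51.08, one per line.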

      

+7




It looks like each column has a fixed size (number of characters). As you said, you cannot split on tabs or spaces, because of the last line, where there is no tab or space between the name and the city.



I suggest reading one line at a time and then cutting the String up with line.substring(startIndex, endIndex). For example, line.substring(0, 17) gives the name, since the name column is 17 characters wide. You can then split that name into first and last name using a space as the separator.
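
As a rough sketch of that last step, the name column could be split like this. NameSplit and splitName are hypothetical names, and the code assumes every name consists of a first name followed by a last name, as in the sample data.

```java
class NameSplit {

    // Splits "First Last" into { first, last }. The limit of 2 keeps any
    // further spaces inside the last name instead of producing extra elements.
    public static String[] splitName(String nameColumn) {
        return nameColumn.trim().split("\\s+", 2);
    }

    public static void main(String[] args) {
        String line = "Robert PrzepioskiRochester, NY   47.00100.3449.2651.08";
        String[] name = splitName(line.substring(0, 17));
        System.out.println(name[0]); // Robert
        System.out.println(name[1]); // Przepioski
    }
}
```

Trimming before the split matters here, because shorter names are padded with spaces up to the column width.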

+1




Assuming the fields are fixed width, which is what it appears to be, you can perform substring operations to get each field and then parse accordingly. Something like:

String name = line.substring(0, x);
String cityState = line.substring(x, y);
String num1 = line.substring(y, z);

      

Etc. where x, y and z are column breaks.
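
To make the "parse accordingly" part concrete for the scores, here is one possible sketch that turns the four score fields into double values. The class ScoreParser and the BREAKS array are illustrative names, and the break positions (33, 38, 44, 49, 54) are assumptions read off the sample data, where the second score always has three digits before the decimal point.

```java
class ScoreParser {

    // Column breaks for the four scores, taken from the sample data.
    private static final int[] BREAKS = { 33, 38, 44, 49, 54 };

    // Cuts each score out of the line and converts it to a double.
    public static double[] parseScores(String line) {
        double[] scores = new double[BREAKS.length - 1];
        for (int i = 0; i < scores.length; i++) {
            String field = line.substring(BREAKS[i], BREAKS[i + 1]).trim();
            scores[i] = Double.parseDouble(field);
        }
        return scores;
    }

    public static void main(String[] args) {
        String line = "Robert PrzepioskiRochester, NY   47.00100.3449.2651.08";
        for (double score : parseScores(line)) {
            System.out.println(score);
        }
    }
}
```

If a score ever had a different number of digits, these fixed breaks would cut in the wrong place, so this only fits files that really are fixed width throughout.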

0




It seems to be the good old fixed-position (fixed-width) format. It was very popular in the days of punch cards.

So, basically, you read this file line by line and then:

String name = line.substring(0,17).trim();
String location = line.substring(17,33).trim();

String[] scores = new String[4];
scores[0] = line.substring(33,38);
scores[1] = line.substring(38,44);
scores[2] = line.substring(44,49);
scores[3] = line.substring(49,54);

      

Then you can go ahead and split the name on the space and the location on the comma (where there is one), convert the scores to numbers, etc.
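
Because not every location in the sample contains a comma ("Southwest FL") and one has no space after it ("Skokie Value,IL"), a split on the comma alone is fragile. One possible alternative, sketched below, relies on the assumption that the state is always the last two characters of the trimmed location. LocationSplit and splitLocation are made-up names.

```java
class LocationSplit {

    // Splits a location such as "New York, NY", "Skokie Value,IL" or
    // "Southwest FL" into { city, state }. Assumes the state is always the
    // last two characters, which holds for the sample data.
    public static String[] splitLocation(String location) {
        String trimmed = location.trim();
        int stateStart = trimmed.length() - 2;
        String state = trimmed.substring(stateStart);
        String city = trimmed.substring(0, stateStart).trim();
        // Drop the separating comma if the city ends with one.
        if (city.endsWith(",")) {
            city = city.substring(0, city.length() - 1).trim();
        }
        return new String[] { city, state };
    }

    public static void main(String[] args) {
        String[] samples = { "New York, NY", "Skokie Value,IL", "Southwest FL" };
        for (String location : samples) {
            String[] parts = splitLocation(location);
            System.out.println(parts[0] + " | " + parts[1]);
        }
    }
}
```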

If you want to make all of the above more general, you can prepare a list of indices and create an array based on those indices:

int[] fieldIndexes = { 0, 17,33,38,44,49,54 };
String values[] = new String[fieldIndexes.length - 1];

      

And then in your read loop (again, I'm assuming you read each line into the variable line):

for (int i = 1; i < fieldIndexes.length; i++) {
    values[i-1] = line.substring(fieldIndexes[i-1], fieldIndexes[i]).trim();
}

      

And then go on to work with the values array.

Of course, make sure that every line you read has the expected number of characters, etc., so that substring() does not throw a StringIndexOutOfBoundsException.
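
One way to guard against short lines, such as a blank line at the end of the file, is to check the length before cutting. This is only a sketch: SafeSplit is a hypothetical class, and returning null for a bad line is just one possible convention (throwing an exception or skipping the line would work too).

```java
class SafeSplit {

    // Column breaks taken from the sample data; the last entry is the
    // minimum length a line must have.
    private static final int[] FIELD_INDEXES = { 0, 17, 33, 38, 44, 49, 54 };

    // Returns the trimmed fields, or null if the line is too short to
    // contain every column.
    public static String[] split(String line) {
        int required = FIELD_INDEXES[FIELD_INDEXES.length - 1];
        if (line == null || line.length() < required) {
            return null;
        }
        String[] values = new String[FIELD_INDEXES.length - 1];
        for (int i = 1; i < FIELD_INDEXES.length; i++) {
            values[i - 1] = line.substring(FIELD_INDEXES[i - 1], FIELD_INDEXES[i]).trim();
        }
        return values;
    }

    public static void main(String[] args) {
        String good = "Robert PrzepioskiRochester, NY   47.00100.3449.2651.08";
        System.out.println(split(good).length);   // 6
        System.out.println(split("   ") == null); // true
    }
}
```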

0




Why don't you split by index? The scores (called coordinates in the code below) are trickier, but if you always have two digits after the decimal point, then this example might help.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;


public class Split {

    public static void main(String[] args) throws IOException {

        List<Person> lst = new ArrayList<Split.Person>();

        BufferedReader br = new BufferedReader(new FileReader("c:\\test\\file.txt"));

        try {
            String line = null;

            while ((line = br.readLine()) != null) {

                Person p = new Person();

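                String[] name = line.substring(0,17).trim().split(" ", 2);
                String location = line.substring(17,33).trim();

                p.setName(name[0]);
                p.setLastname(name[1]);
                // The state is the last two characters; the city is everything before it,
                // minus the comma. Splitting on spaces would break "New York, NY".
                int stateStart = location.length() - 2;
                p.setCity(location.substring(0, stateStart).replace(",","").trim());
                p.setState(location.substring(stateStart));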

                String[] coordinates = new String[4];
                String coor = line.substring(33);

                String first = coor.substring(0, coor.indexOf(".") + 3);

                coor = coor.substring(first.length());

                String second = coor.substring(0, coor.indexOf(".") + 3);

                coor = coor.substring(second.length());

                String third = coor.substring(0, coor.indexOf(".") + 3);

                coor = coor.substring(third.length());

                String fourth = coor.substring(0, coor.indexOf(".") + 3);

                coordinates[0] = first;
                coordinates[1] = second;
                coordinates[2] = third;
                coordinates[3] = fourth;

                p.setCoordinates(coordinates);

                lst.add(p);
            }

        } finally {
            br.close();
        }

        for(Person p : lst){
            System.out.println(p.getName());
            System.out.println(p.getLastname());
            System.out.println(p.getCity());
            System.out.println(p.getState());
            for(String s : p.getCoordinates()){
                System.out.println(s);
            }

            System.out.println();
        }
    }

    public static class Person {

        public Person(){}

        private String name;
        private String lastname;
        private String city;
        private String state;
        private String[] coordinates;
        public String getName() {
            return name;
        }
        public void setName(String name) {
            this.name = name;
        }
        public String getLastname() {
            return lastname;
        }
        public void setLastname(String lastname) {
            this.lastname = lastname;
        }
        public String getCity() {
            return city;
        }
        public void setCity(String city) {
            this.city = city;
        }
        public String getState() {
            return state;
        }
        public void setState(String state) {
            this.state = state;
        }
        public String[] getCoordinates() {
            return coordinates;
        }
        public void setCoordinates(String[] coordinates) {
            this.coordinates = coordinates;
        }
    }

}

      

0




Read the file line by line, then cut each line at the appropriate offsets. Note that the end index of substring() is exclusive. For example:

private static String[] split(String line) {
    return new String[] {
        line.substring(0, 17).trim(),   // name
        line.substring(17, 33).trim(),  // location
        line.substring(33, 38).trim(),  // first score
        line.substring(38, 44).trim(),  // second score
        line.substring(44, 49).trim(),  // third score
        line.substring(49, 54).trim(),  // fourth score
    };
}

      

0








