How to get the sum of two elements in a list of arrays in Scala

Hi, and thanks in advance for taking the time to read this.

I am writing a piece of code in Scala to read a data file and create a couple of aggregations. For simplicity, let's assume the content looks like the following (entries are tab delimited):

01/12/2015 JACK M 21XYZ 56 200

01/14/2015 JOHN M 22ABS 34 145

I want to multiply the last two numbers, store the product together with the second element (the name), and then run some statistics (min, max, top 10, etc.).

The steps I have taken so far:

1- Read the file

    val dat = scala.io.Source.fromFile("abs.txt")

      

2- Put content in the list

    val datList = try dat.getLines().toList finally dat.close()

      

3- Split each string into an array of strings

    val datArray = datList.map(_.split('\t'))

      

After these steps I have a list of arrays of strings. I am stuck at this point: I don't know how to calculate the product of the last two elements of each array and store the results in a map with the name as the key.

When I try something like

    val res = datArray.map(x => x(4).toInt * x(5).toDouble)

      

it returns a Unit and I can't do anything about it.

I would appreciate it if you can shed some light.

I found something similar in the following link, but it operates on two separate arrays, which seems to be simpler.

Element-wise sum of arrays in Scala

Thanks,

Moe



2 answers


This, of course, does not "return a Unit": res is a list of doubles. You forgot the name, but that is easy to fix:

    val res = datArray.map(x => x(1) -> x(4).toInt * x(5).toDouble)

      

You now have a sequence of tuples, Seq[(String, Double)], where the first element is the name and the second is the product you computed.
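As a minimal sketch of this mapping step, assuming the two sample lines from the question are held in memory instead of being read from abs.txt (the object name ProductsSketch and the helper toPair are hypothetical, introduced only for illustration):

```scala
object ProductsSketch {
  // The two sample records from the question, tab-delimited.
  val lines: List[String] = List(
    "01/12/2015\tJACK\tM\t21XYZ\t56\t200",
    "01/14/2015\tJOHN\tM\t22ABS\t34\t145"
  )

  // Hypothetical helper: split one line and pair the name (field 1)
  // with the product of the last two fields (fields 4 and 5).
  def toPair(line: String): (String, Double) = {
    val x = line.split('\t')
    x(1) -> x(4).toInt * x(5).toDouble
  }

  // One (name, product) tuple per input line.
  val res: List[(String, Double)] = lines.map(toPair)

  def main(args: Array[String]): Unit =
    res.foreach { case (name, score) => println(s"$name: $score") }
}
```

Note that toInt and toDouble throw a NumberFormatException on malformed fields, so real input may need validation that this sketch leaves out.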



You can do all sorts of things with this list:

  • Convert it to a map of name → score: res.toMap (beware: if there are duplicate entries for the same name, only the last one for each name is kept)
  • Find the entry with the lowest score: val (name, score) = res.minBy(_._2)
  • Find the entry with the highest score: val (name, score) = res.maxBy(_._2)
  • Find the total of all scores: res.map(_._2).sum
  • Find the top ten entries: res.sortBy(-_._2).take(10)
  • Combine the scores with the same name into a map like the one in the first item, but with the values being the total score for each name: res.groupBy(_._1).mapValues(_.map(_._2).sum)
  • etc.
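
The operations in this list can be tried out on a small in-memory sample. The following is only a sketch: the object name StatsSketch and the three pairs are made up for illustration (JACK appears twice on purpose), and the per-name sum uses a plain map over the grouped entries because mapValues on Map is deprecated since Scala 2.13.

```scala
object StatsSketch {
  // Hypothetical (name, score) pairs; JACK appears twice on purpose.
  val res: List[(String, Double)] = List(
    "JACK" -> 11200.0,
    "JOHN" -> 4930.0,
    "JACK" -> 300.0
  )

  // Entries with the smallest and largest score.
  val lowest: (String, Double) = res.minBy(_._2)
  val highest: (String, Double) = res.maxBy(_._2)

  // Sum of all scores.
  val total: Double = res.map(_._2).sum

  // Top two entries by score, highest first (take(10) works the same way).
  val top2: List[(String, Double)] = res.sortBy(-_._2).take(2)

  // Total score per name: group by name, then sum each group's scores.
  val perName: Map[String, Double] =
    res.groupBy(_._1).map { case (name, entries) => name -> entries.map(_._2).sum }
}
```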


    val res = datArray.map(x => (x(1), x(4).toInt * x(5).toDouble)).toMap



You were almost there: you returned a list of values. You have to convert them to pairs by adding a key (a pair looks like (key, value)) and then call toMap.
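
One thing worth checking before relying on toMap is how repeated names behave. The sketch below uses made-up pairs and a hypothetical object name, ToMapSketch, to show that a later pair replaces an earlier one with the same key, and that lookups return an Option.

```scala
object ToMapSketch {
  // Hypothetical pairs with a repeated key to show what toMap keeps.
  val pairs: List[(String, Double)] = List(
    "JACK" -> 11200.0,
    "JOHN" -> 4930.0,
    "JACK" -> 300.0
  )

  // Later pairs overwrite earlier ones that share a key.
  val byName: Map[String, Double] = pairs.toMap

  // Lookups with get return an Option, so a missing name does not throw.
  val jack: Option[Double] = byName.get("JACK")
  val missing: Option[Double] = byName.get("MOE")
}
```

If each name can occur on several lines, the values probably need to be combined per name before building the map.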
