Simplifying data structures and conditional statements in Python code

I was wondering if there are ways to simplify the following piece of code. As you can see, numerous dicts are used, as well as conditional expressions to cut off bad inputs. Note that the trip rate values are not all entered yet; the dicts are just copied and pasted for now.

EDIT

In each of the rates dicts, entries have the form (x, y): z. The x and y values are correct; the z values are not the real ones, as they are just copied and pasted.

This code works if you want to copy, paste and test it:

import math


# step 1.4 return trip rates
def trip_rates( population_stratification, analysis_type, low_income, medium_income, high_income ):
  ''' this function returns the proper trip rate tuple to be used based on input 
    data 
    ADPT = Average Daily Person Trips per Household
    pph = person per household
    veh_hh = vehicles per household
    (param_1, param_2): ADPT
  '''
  li = low_income
  mi = medium_income
  hi = high_income
  # table 5 -
  if analysis_type == 1:
    if population_stratification == 1:
      rates = {( li, 1 ):3.6, ( li, 2 ):6.5, ( li, 3 ):9.1, ( li, 4 ):11.5, ( li, 5 ): 13.8,
               ( mi, 1 ):3.9, ( mi, 2 ):7.3, ( mi, 3 ):10.0, ( mi, 4 ):13.1, ( mi, 5 ): 15.9,
               ( hi, 1 ):4.5, ( mi, 2 ):9.2, ( mi, 3 ):12.2, ( mi, 4 ):14.8, ( mi, 5 ): 18.2}
      return rates
    if population_stratification == 2:
      rates = {
               ( li, 1 ):3.1, ( li, 2 ):6.3, ( li, 3 ):9.4, ( li, 4 ):12.5, ( li, 5 ): 14.7,
               ( mi, 1 ):4.8, ( mi, 2 ):7.2, ( mi, 3 ):10.1, ( mi, 4 ):13.3, ( mi, 5 ): 15.5,
               ( hi, 1 ):4.9, ( mi, 2 ):7.7, ( mi, 3 ):12.5, ( mi, 4 ):13.8, ( mi, 5 ): 16.7
              }
      return rates
    if population_stratification == 3: #TODO: input actual rate
      rates = {
               ( li, 1 ):3.6, ( li, 2 ):6.5, ( li, 3 ):9.1, ( li, 4 ):11.5, ( li, 5 ): 13.8,
               ( mi, 1 ):3.9, ( mi, 2 ):7.3, ( mi, 3 ):10.0, ( mi, 4 ):13.1, ( mi, 5 ): 15.9,
               ( hi, 1 ):4.5, ( mi, 2 ):9.2, ( mi, 3 ):12.2, ( mi, 4 ):14.8, ( mi, 5 ): 18.2
              }
      return rates
    if population_stratification == 4: #TODO: input actual rate
      rates = {
               ( li, 1 ):3.1, ( li, 2 ):6.3, ( li, 3 ):9.4, ( li, 4 ):12.5, ( li, 5 ): 14.7,
               ( mi, 1 ):4.8, ( mi, 2 ):7.2, ( mi, 3 ):10.1, ( mi, 4 ):13.3, ( mi, 5 ): 15.5,
               ( hi, 1 ):4.9, ( mi, 2 ):7.7, ( mi, 3 ):12.5, ( mi, 4 ):13.8, ( mi, 5 ): 16.7
              }
      return rates
  #table 6
  elif analysis_type == 2:
    if population_stratification == 1: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
    if population_stratification == 2: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
    if population_stratification == 3: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
    if population_stratification == 4: #TODO: Change rates
      rates = {
               ( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
               ( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
               ( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
               ( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
              }
      return rates
  # table 7
  elif analysis_type == 3:
    if population_stratification == 1: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates
    if population_stratification == 2: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates
    if population_stratification == 3: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates
    if population_stratification == 4: #TODO: input actual rate
      rates = {
               ( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
               ( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
               ( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
              }
      return rates

def interpolate( population_stratification, analysis_type, low_income, medium_income, high_income, x, y ):
  #get rates dict
  rates = trip_rates( population_stratification, analysis_type, low_income, medium_income, high_income )


  # dealing with x parameters
  #when using income levels, x_1 and x_2 are li, mi, or hi
  if analysis_type == 1 or analysis_type == 2 or analsis_type == 4:
    if x < high_income and x >= medium_income:
      x_1 = medium_income
      x_2 = high_income
    elif x < medium_income:
      x_1 = low_income
      x_2 = medium_income
    else:
      x_1 = high_income
      x_2 = high_income
  if analysis_type == 3:
    if x >= 3:
      x_1 = 3
      x_2 = 3
    else:
      x_1 = int( math.floor( x ) )
      x_2 = int( math.ceil( x ) )

  # dealing with y parametrs
  #when using persons per household, max number y = 5
  if analysis_type == 1 or analysis_type == 4:
    if y >= 5:
      y_1 = 5
      y_2 = 5
    else:
      y_1 = int( math.floor( y ) )
      y_2 = int( math.ceil( y ) )
  elif analysis_type == 2 or analysis_type == 3:
    if y >= 5:
      y_1 = 5
      y_2 = 5
    else:
      y_1 = int( math.floor( y ) )
      y_2 = int( math.ceil( y ) )

  # denominator
  z = ( ( x_2 - x_1 ) * ( y_2 - y_1 ) )

  result = ( ( ( rates[( x_1, y_1 )] ) * ( ( x_2 - x ) * ( y_2 - y ) ) / ( z ) ) +
             ( ( rates[( x_2, y_1 )] ) * ( ( x - x_1 ) * ( y_2 - y ) ) / ( z ) ) +
             ( ( rates[( x_1, y_2 )] ) * ( ( x_2 - x ) * ( y - y_1 ) ) / ( z ) ) +
             ( ( rates[( x_2, y_2 )] ) * ( ( x - x_1 ) * ( y - y_1 ) ) / ( z ) ) )

  return result

#test
low_income = 20000 #this is calculated using exchange rates
medium_income = 40000 # this is calculated using exchange rates
high_income = 60000 # this is calculated using exchange rates
population_stratification = 1 #inputed by user
analysis_type = 1 #inputed by user
x = 35234.34 #test income
y = 3.5 # test pph

print interpolate( population_stratification, analysis_type, low_income, medium_income, high_income, x, y )

      



2 answers


Well, where to start? Here's just the first note:

You have a lot of data in there and it seems like code and data are mixing with each other.

Data and code should be separate. Data is an external source that you modify or read. You could probably adapt your code to quickly parse the data from a nicely editable representation into one that is useful for your algorithms. I suspect that your code would be shorter, clearer, and less error prone (have you noticed that all the "rates" dictionaries have duplicate keys and that you are missing a lot of "hi" keys?).

If you want better abstractions such as matrices and data arrays, take a look at numpy.


Edit 1

Have you counted the number of dimensions? Here you have a multidimensional matrix with these dimensions: analysis_type, population_stratification, income_level, index

If I understand correctly, this is a 3x4x3x3 (= 108 records) "matrix" or "lookup table". If it's the data your model relies on, great. But couldn't you put these numbers in a file or table that you read in? Your code would be nearly trivial.
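
As a rough sketch of what reading the table from data could look like (written for Python 3): the column names, the RATES_CSV text and the load_rates helper are all made up for illustration, and only a few cells from the question's first table are shown.

```python
import csv
import io

# Hypothetical file contents; in practice this would live in its own file.
# Only four cells from table 5, stratification 1 are included here.
RATES_CSV = """\
analysis_type,population_stratification,income_level,pph,rate
1,1,low,1,3.6
1,1,low,2,6.5
1,1,medium,1,3.9
1,1,medium,2,7.3
"""


def load_rates(text):
    """Parse CSV text into {(analysis_type, stratification): {(level, pph): rate}}."""
    tables = {}
    for row in csv.DictReader(io.StringIO(text)):
        table_key = (int(row["analysis_type"]),
                     int(row["population_stratification"]))
        cell_key = (row["income_level"], int(row["pph"]))
        tables.setdefault(table_key, {})[cell_key] = float(row["rate"])
    return tables


RATES = load_rates(RATES_CSV)
```

With the numbers kept outside the code, picking a table becomes a single lookup such as RATES[(1, 1)], and a mistyped key shows up as a wrong row in the file rather than as a silently overwritten dict entry.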


Edit 2

Ok, I'll bite with some minor Python style: testing whether a value is in a set or range.

Instead of:

if analysis_type == 1 or analysis_type == 2 or analsis_type == 4:

      

you can use



if analysis_type in (1, 2, 4):

      

or even using readable names like (CUBIC, ..) as suggested.

Instead of:

if x < high_income and x >= medium_income:

      

you can use chained comparisons; Python is one of the few programming languages where chained conditions make natural if statements:

if medium_income <= x < high_income:
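
For instance, the income branch of interpolate could be pulled into a small helper that uses the chained form. This is only a sketch: the name income_bracket is hypothetical, and the logic is meant to mirror the question's if/elif/else, nothing more.

```python
def income_bracket(x, low_income, medium_income, high_income):
    """Return the (x_1, x_2) income breakpoints used for interpolation."""
    if x < medium_income:
        return low_income, medium_income
    if medium_income <= x < high_income:   # chained comparison
        return medium_income, high_income
    # x is at or above the highest breakpoint
    return high_income, high_income
```

The chained test reads like the mathematical interval it describes, and x is only written once.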

      


Edit 3

More important than small pieces of code is, of course, code design and refactoring. Edit 2 can only give you a little bit of polish.

You should learn to hate duplicate code.

Also, you have quite a few branches in one function. This is a good sign that you should break it down into several functions, which can also reduce duplication. For example, when a single variable such as analysis_type can completely change what a function does, why keep the different behaviors in the same function? You don't have to have the whole program in one function. Perhaps analysis_type == 3 is better expressed in its own function (as an example)?
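
One possible split, as a sketch: let the interpolation arithmetic live in its own function (hypothetically named bilinear here), so that interpolate only has to choose the cell corners for each analysis type. The formula is the one from the question; like the original, it assumes the cell is not degenerate.

```python
def bilinear(rates, x, y, x_1, x_2, y_1, y_2):
    """Bilinear interpolation inside the cell with corners (x_1, y_1)..(x_2, y_2).

    rates maps (x, y) corner tuples to values, as in the question.
    Assumes x_1 != x_2 and y_1 != y_2; otherwise the area is zero and the
    division fails, exactly as in the original code.
    """
    area = float((x_2 - x_1) * (y_2 - y_1))
    return (rates[(x_1, y_1)] * (x_2 - x) * (y_2 - y) +
            rates[(x_2, y_1)] * (x - x_1) * (y_2 - y) +
            rates[(x_1, y_2)] * (x_2 - x) * (y - y_1) +
            rates[(x_2, y_2)] * (x - x_1) * (y - y_1)) / area
```

A function like this can be tested on a tiny hand-made grid without touching any of the trip rate tables.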

Do you realize that your function trip_rates basically does an array lookup, where the lookup is hardcoded as if ..: return .. if ..: return .. and the array is written out entirely inside the function? What if trip_rates could be implemented like this? Would that be possible?

data_model = compute_table(low_income, ...)
return data_model[analysis_type][population_stratification]
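
Here is one way compute_table might look, as a sketch only. The row-per-income-level layout and the rows_to_rates helper are my assumptions, only table 5 / stratification 1 is filled in, and the third row is stored under the high income key, which is presumably what the question intended.

```python
def rows_to_rates(rows, columns):
    """Expand {row_key: (rate, ...)} into a {(row_key, column): rate} dict."""
    return {(row_key, column): rate
            for row_key, row in rows.items()
            for column, rate in zip(columns, row)}


def compute_table(low_income, medium_income, high_income):
    """Build data_model[analysis_type][population_stratification]."""
    pph = (1, 2, 3, 4, 5)  # persons per household
    # Values copied from the question's first dict; other tables omitted.
    table_5_strat_1 = {
        low_income:    (3.6, 6.5, 9.1, 11.5, 13.8),
        medium_income: (3.9, 7.3, 10.0, 13.1, 15.9),
        high_income:   (4.5, 9.2, 12.2, 14.8, 18.2),  # assumed to be the hi row
    }
    return {1: {1: rows_to_rates(table_5_strat_1, pph)}}


def trip_rates(population_stratification, analysis_type,
               low_income, medium_income, high_income):
    data_model = compute_table(low_income, medium_income, high_income)
    return data_model[analysis_type][population_stratification]
```

Writing one row per income level means each key is generated rather than typed, so the duplicate-key mistake cannot happen.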

      



Along with Kaiser's suggestion about separating data and code, here are some simple fixes:

The code

if y >= 5:
  y_1 = 5
  y_2 = 5
else:
  y_1 = int( math.floor( y ) )
  y_2 = int( math.ceil( y ) )

      

can be written as

min(5, int(math.floor(y)))

      

or



int(math.floor(min(5, y)))

      

or even made into a function:

def limitedInt(v, maxV):
   return min(maxV, int(math.floor(v)))
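
Note that these expressions only give the lower value (y_1); the upper one (y_2) needs math.ceil in the same way. A small sketch, with hypothetical function names, that checks a clamped version against the original branch for a few sample values:

```python
import math


def original_bounds(y):
    """The if/else from the question."""
    if y >= 5:
        return 5, 5
    return int(math.floor(y)), int(math.ceil(y))


def clamped_bounds(y, max_v=5):
    """Same result without the branch: clamp first, then round both ways."""
    y = min(max_v, y)
    return int(math.floor(y)), int(math.ceil(y))


# Spot-check that both versions agree, including at and above the cap.
for value in (0.2, 1.0, 3.5, 4.999, 5, 7.8):
    assert original_bounds(value) == clamped_bounds(value)
```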

      

Also, I would recommend that, instead of saying analysis_type == 1, you say something like analysis_type == CUBIC (i.e. a name that describes the type of analysis) and set the name to 1. This is not so much a simplification as a way to make the code more readable.
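
A tiny sketch of the idea. The names below are hypothetical, taken from the "table 5/6/7" comments in the question because I don't know what the analysis types really stand for; uses_income_levels is likewise just an illustration.

```python
# Hypothetical names, based only on the table comments in the question.
TABLE_5 = 1
TABLE_6 = 2
TABLE_7 = 3


def uses_income_levels(analysis_type):
    """Mirrors the question's x-axis test, leaving out the undefined type 4."""
    return analysis_type in (TABLE_5, TABLE_6)
```

A misspelled constant fails immediately with a NameError, whereas a wrong magic number just quietly selects the wrong branch.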

You may find the book Refactoring by Martin Fowler, or William Wake's Refactoring Workbook, useful as a way to learn about code cleanup (a website is also available, but if you don't know about the "code smells" described in the books, it's not that useful).
