How can I simplify the function to find out if a string contains coordinates that describe a polygon?
I have the following string:
points = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"
points is a string containing coordinate values (latitude and longitude), separated by commas.
I want to check that this string contains only integer or float values and that the first coordinate is equal to the last one.
I have the following code for this:
def validate_points(points):
    coordinates = points.split(',')
    for point in coordinates:
        latlon = point.split(' ')
        latitude = latlon[0]
        longitude = latlon[1]
        if not is_number(latitude) or not is_number(longitude):
            raise WrongRequestDataError("Please, specify the correct type of points value. It must be a numeric value")
    first = coordinates[0]
    last = coordinates[len(coordinates) - 1]
    if first != last:
        raise WrongRequestDataError("Incorrect points format, the first point must be equal to last")

def is_number(s):
    try:
        if float(s) or int(s):
            return True
    except ValueError:
        return False
Is there a way to simplify or speed up this code?
Your input almost looks like a WKT polygon.
Using the shapely package, you can simply try to parse the points as WKT and see what happens, following Python's "It's easier to ask for forgiveness than permission" principle:
# pip install shapely
from shapely import wkt

def is_well_defined_polygon(points):
    try:
        wkt.loads("POLYGON((%s))" % points)
        return True
    except Exception:  # any parsing or geometry error means the input is not a valid polygon
        return False

points = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748, 34.09352 -118.27483"
print(is_well_defined_polygon(points))
# True
print(is_well_defined_polygon("1 2, 3 4"))
# IllegalArgumentException: Points of LinearRing do not form a closed linestring
# False
print(is_well_defined_polygon("a b c d"))
# ParseException: Expected number but encountered word: 'a'
# False
Here are some improvements. You can speed up the is_number function a little and use coordinates[-1] instead of coordinates[len(coordinates) - 1]. You also don't have to define all of these variables:
def validate_points(points):
    # strip() drops the space that follows each comma in the input
    coordinates = [p.strip() for p in points.split(',')]
    for point in coordinates:
        latitude, longitude = point.split(' ', 1)
        if not is_number(latitude) or not is_number(longitude):
            raise WrongRequestDataError("Please, specify the correct type of points value. It must be a numeric value")
    if coordinates[0] != coordinates[-1]:
        raise WrongRequestDataError("Incorrect points format, the first point must be equal to last")

def is_number(s):
    try:
        # float() also parses integer strings; a single call avoids rejecting
        # "0.0", which is falsy as a float and then fails int()
        float(s)
        return True
    except ValueError:
        return False
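One detail worth checking when splitting on ',' is that the sample input separates points with a comma followed by a space, so every piece after the first keeps a leading space. A small sketch (the shortened input and the variable names are only for illustration) of what the split returns and how strip() normalizes it:

```python
points = "34.09352 -118.27483, 34.0914 -118.2758, 34.09352 -118.27483"

raw = points.split(',')
# Every piece after the first starts with the space that followed the comma.
print(raw[0] == raw[-1])        # False, although both describe the same point
print(raw[1].split(' '))        # ['', '34.0914', '-118.2758']

# Stripping each piece first makes both the comparison and the split behave.
clean = [p.strip() for p in raw]
print(clean[0] == clean[-1])    # True
print(clean[1].split(' ', 1))   # ['34.0914', '-118.2758']
```

Calling split() with no argument on each piece would have the same effect, since it ignores leading and trailing whitespace.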
Minor things:
- Use coordinates[-1] instead of coordinates[len(coordinates)-1].
- Use latitude, longitude = point.split(' ', 1). This will invalidate cases like 3.41 47.11 foobar.
- Do you really need latitude and longitude as strings? You probably want a float / int value, so is_number should be something like:
def conv_number(s):
    try:
        return float(s)
    except ValueError:
        try:
            return int(s)
        except ValueError:
            raise WrongRequestDataError(s)
I especially like that you don't use isinstance with float / int for validation: in Python you should always be able to pass in an arbitrary object that acts like an int or a float if asked to do so.
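To illustrate the point about converting instead of only validating, here is a minimal sketch that returns the parsed coordinates. The name parse_points, the error messages, and the local WrongRequestDataError stand-in are assumptions made for the example, not part of the original code:

```python
class WrongRequestDataError(Exception):
    """Stand-in for the application's own exception class (assumed)."""


def conv_number(s):
    # float() also accepts integer strings such as "3"
    try:
        return float(s)
    except ValueError:
        raise WrongRequestDataError("Not a numeric value: %r" % s)


def parse_points(points):
    """Return a list of (latitude, longitude) float tuples for a closed ring."""
    parsed = []
    for point in points.split(','):
        parts = point.split()  # no argument: surrounding whitespace is ignored
        if len(parts) != 2:
            raise WrongRequestDataError("Expected 'lat lon', got: %r" % point)
        parsed.append((conv_number(parts[0]), conv_number(parts[1])))
    # Comparing floats rather than strings treats "1 2" and "1.0 2.0" as equal.
    if parsed[0] != parsed[-1]:
        raise WrongRequestDataError("The first point must be equal to the last")
    return parsed


print(parse_points("1 2, 3.5 4, 1.0 2.0"))
# [(1.0, 2.0), (3.5, 4.0), (1.0, 2.0)]
```

Returning the converted values means the caller does not have to split and convert the string a second time after validation.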
This is how I would do it:
points = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"
def validate_points(points):
    # strip each point so the space after a comma does not affect the comparison below
    separate = [p.strip() for p in points.split(',')]
    try:
        [float(y) for x in separate for y in x.split()]
    except ValueError:
        return False
    return separate[0] == separate[-1]

print(validate_points(points))  # False
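If you really want to raise an error, you can change / simplify the code like this:
def validate_points(points):
    # strip each point so the space after a comma does not affect the comparison below
    separate = [p.strip() for p in points.split(',')]
    [float(y) for x in separate for y in x.split()]  # orphan list comprehension; raises ValueError on non-numeric input
    if separate[0] != separate[-1]:
        raise ValueError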
My solution uses regex named groups to filter the data:
# -*- coding: utf-8 -*-
import re

class WrongRequestDataError(Exception):
    pass

def position_equal(pos1, pos2):
    # return pos1 == pos2  # simple compare
    accuracy = 0.005
    return (
        abs(float(pos1['latitude']) - float(pos2['latitude'])) <= accuracy and
        abs(float(pos1['longitude']) - float(pos2['longitude'])) <= accuracy
    )

test_str = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"
regex = r"(?P<position>(?P<latitude>\-?\d+(\.\d+)?) (?P<longitude>\-?\d+(\.\d+)?))"
matches = re.finditer(regex, test_str, re.IGNORECASE)
matched = []
for matchNum, match in enumerate(matches):
    matched.append({
        'latitude': match.group('latitude'),
        'longitude': match.group('longitude'),
    })
matched_count = len(matched)
if matched_count != test_str.count(',') + 1:
    raise WrongRequestDataError("Please, specify the correct type of points value. It must be a numeric value")
else:
    if matched_count > 1:
        if not position_equal(matched[0], matched[-1]):
            raise WrongRequestDataError("Incorrect points format, the first point must be equal to last")
You can change the accuracy value in the position_equal function to adjust the tolerance used when comparing the first and last positions.
You can check or debug regex in regex101: https://regex101.com/r/tYYJXN/1/
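The same idea can also be packed into a function that validates the whole string in one pass. This is only a sketch under assumptions: the names NUMBER, PAIR, WHOLE_RE, PAIR_RE and is_closed_ring are made up here, and math.isclose with an absolute tolerance stands in for position_equal:

```python
import math
import re

# One number: optional minus sign, digits, optional fractional part.
NUMBER = r"-?\d+(?:\.\d+)?"
PAIR = NUMBER + " " + NUMBER
# The whole input: pairs separated by a comma and optional whitespace.
WHOLE_RE = re.compile(r"{0}(?:,\s*{0})*".format(PAIR))
# Capturing version used to pull out latitude and longitude.
PAIR_RE = re.compile(r"({0}) ({0})".format(NUMBER))


def is_closed_ring(points, accuracy=0.005):
    # fullmatch rejects any stray text between or around the pairs
    if not WHOLE_RE.fullmatch(points):
        return False
    pairs = [(float(lat), float(lon)) for lat, lon in PAIR_RE.findall(points)]
    first, last = pairs[0], pairs[-1]
    return (math.isclose(first[0], last[0], abs_tol=accuracy) and
            math.isclose(first[1], last[1], abs_tol=accuracy))


sample = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"
print(is_closed_ring(sample))       # True: first and last differ by less than 0.005
print(is_closed_ring("1 2, 3 4"))   # False: the ring is not closed
print(is_closed_ring("a b c d"))    # False: not numeric
```

Note that with a tolerance of 0.005 the sample string counts as closed even though its first and last points are not textually identical, so pick the tolerance to match how strict the check needs to be.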