What is the best data model for representing a mathematical range (in a database, XML, JSON, ...)?

For example:

greater than or equal to 50 and less than 100 (>= 50 && < 100)

less than 10 or more than 40 (< 10 || > 40)

I have been thinking about how to represent a mathematical range in a file and in a database. The range may be entered by a non-programmer, so the input needs to be simple, but it should also be easy to convert the input into data and to check it for errors. For example, "< 10 || > 100" seems to be the simplest to type, but it is harder for me to parse the string to get the data, and I also need to account for input format errors.

I've looked at some input methods, using >= 50 && < 100 as an example:

1. Use one line to represent the whole range:

<rangeInString> >=50 && <100 </rangeInString>

      

2. Use two lines, one for the lower bound and one for the upper bound, then parse each line in the program:

<lowerBound> >=50 </lowerBound>
<upperBound> <100 </upperBound>

      

3. Separate the lower and upper bounds, and also separate the sign from the number:

<lowerBound>
    <sign> >= </sign>
    <data>50</data>
</lowerBound>
<upperBound>
    <sign> < </sign>
    <data>100</data>
</upperBound>

      

4. Separate the lower and upper bounds and the sign, and also store separately whether the bound includes equality:

<lowerBound>
    <sign> > </sign>
    <isIncludeEqual>true</isIncludeEqual>
    <data>50</data>
</lowerBound>
<upperBound>
    <sign> < </sign>
    <isIncludeEqual>false</isIncludeEqual>
    <data>100</data>
</upperBound>

      

5. Auto-detect whether "&&" or "||" applies. For example, given >= A and < B: if A < B it must be "&&", e.g. (>= 50 && < 100); otherwise it is "||", e.g. (>= 100 || < 50):

<A>
    <sign> > </sign>
    <isIncludeEqual>true</isIncludeEqual>
    <data>50</data>
</A>
<B>
    <sign> < </sign>
    <isIncludeEqual>false</isIncludeEqual>
    <data>100</data>
</B>

      

6. Use an "isAnd" field to distinguish >= 50 && < 100 (true) from <= 50 || > 100 (false), instead of using the sign fields "<" and ">":

<lowerBound>
    <isIncludeEqual>true</isIncludeEqual>
    <data>50</data>
</lowerBound>
<upperBound>
    <isIncludeEqual>false</isIncludeEqual>
    <data>100</data>
</upperBound>
<isAnd>true</isAnd>

      

7. Another data model ...

I need to keep these points in mind:

1. Easy for a non-programmer to type.

2. Easy to convert or parse the data in a program.

3. Easy to check for errors. For example, parsing a single line makes data conversion and format checking more complex, and there are other kinds of invalid input too: <= 50 && > 100 can never be valid. I could auto-detect "&&" or "||" from the input signs, but that may increase the complexity of the code.
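To make the trade-off in point 3 concrete, here is a minimal Python sketch of what parsing and checking the one-line format might involve. The names `parse_range` and `_check_not_empty` are hypothetical, and the sketch assumes at most two comparisons joined by a single "&&" or "||".

```python
import re
from typing import List, Tuple

# One comparison such as ">= 50" or "<100"; whitespace is optional.
_TERM = re.compile(r"\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def _check_not_empty(terms: List[Tuple[str, float]]) -> None:
    """Reject an '&&' pair that no number can satisfy, e.g. '<= 50 && > 100'."""
    lows = [(op, v) for op, v in terms if op in (">", ">=")]
    highs = [(op, v) for op, v in terms if op in ("<", "<=")]
    if lows and highs:
        (low_op, low), (high_op, high) = lows[0], highs[0]
        both_inclusive = low_op == ">=" and high_op == "<="
        if low > high or (low == high and not both_inclusive):
            raise ValueError("range is empty: no value satisfies both sides")


def parse_range(text: str) -> Tuple[str, List[Tuple[str, float]]]:
    """Parse '>= 50 && < 100' or '< 10 || > 40' into (joiner, terms)."""
    if "&&" in text and "||" in text:
        raise ValueError("mixing && and || is not supported in this sketch")
    joiner = "||" if "||" in text else "&&"
    terms = []
    for part in text.split(joiner):
        m = _TERM.fullmatch(part)
        if m is None:
            raise ValueError(f"cannot read comparison: {part!r}")
        terms.append((m.group(1), float(m.group(2))))
    if len(terms) > 2:
        raise ValueError("at most two comparisons are expected")
    if joiner == "&&":
        _check_not_empty(terms)
    return joiner, terms
```

Even this small version needs a regular expression, a joiner check and an emptiness check, which is roughly the extra complexity the structured options try to avoid.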

Does anyone have an idea?

+3




4 answers


Why encode it at all? There is no benefit or need, and it adds some hassle.

Just store the end-of-range values:

low_end int,
high_end int,

      

You can then convert these raw values to usable expressions in either SQL or application code. With integer values you don't need to store inclusiveness, because an exclusive bound is just the inclusive bound shifted by one: exclusive = inclusive - 1 for the low end (x >= 50 is the same as x > 49) and exclusive = inclusive + 1 for the high end (x <= 99 is the same as x < 100).

Here's the SQL implementation:

where (low_end is null or col > low_end)
and (high_end is null or col < high_end)
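As a quick way to try this out, here is a small sketch using Python's built-in `sqlite3` module. The table name `ranges`, the `name` column and the helper `matching` are assumptions made for the example; only `low_end` and `high_end` come from the answer. Bounds are stored as exclusive integers, and NULL means "no limit on this side".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ranges (name text, low_end int, high_end int)")
conn.executemany(
    "insert into ranges values (?, ?, ?)",
    [
        ("50 <= x < 100", 49, 100),  # inclusive 50 is stored as exclusive 49
        ("x > 40", 40, None),        # NULL high_end: no upper limit
    ],
)


def matching(value):
    """Return the names of all stored ranges that contain value."""
    rows = conn.execute(
        """
        select name from ranges
        where (low_end is null or ? > low_end)
          and (high_end is null or ? < high_end)
        order by name
        """,
        (value, value),
    )
    return [name for (name,) in rows]
```

Note that one row can only describe an "and" range; a range such as < 10 || > 40 would need two rows, one per piece.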

      




If the end-of-range values are to be floating point numbers, you need a little more, because the shift-by-one trick no longer works:

low_end float,
low_inclusive boolean,
high_end float,
high_inclusive boolean,

      

And more code:

where (low_end is null or col > low_end or (low_inclusive and col = low_end))
and (high_end is null or col < high_end or (high_inclusive and col = high_end))
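The same check in application code might look like the sketch below. The class name `Bounds` and its `contains` method are hypothetical; the four fields mirror the columns above, with `None` playing the role of NULL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bounds:
    low_end: Optional[float] = None   # None: no lower limit
    low_inclusive: bool = False
    high_end: Optional[float] = None  # None: no upper limit
    high_inclusive: bool = False

    def contains(self, x: float) -> bool:
        """True if x lies inside the range, honouring the inclusive flags."""
        if self.low_end is not None:
            if x < self.low_end or (x == self.low_end and not self.low_inclusive):
                return False
        if self.high_end is not None:
            if x > self.high_end or (x == self.high_end and not self.high_inclusive):
                return False
        return True
```

For example, `Bounds(50.0, True, 100.0, False)` describes >= 50 && < 100.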

      

+3




That's a good question. How about a combination of the interval notation suggested by Gordon and a designated symbol for infinity? This, combined with individual fields (or a parsing algorithm), can define any range.

For example, the range (3 < x < 20) can be written as (3,20). The range (x <= 10 || x > 30) can be written as the combination (-_,10],(30,_), where _ represents infinity. Or use the actual infinity character, ∞ (Unicode U+221E).



This notation would be clear to those with a mathematical background, I believe, and would provide unlimited flexibility.
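If you go the parsing route, a rough Python sketch for this notation could look like the following. The function names `parse_intervals` and `contains` are hypothetical, and the sketch assumes intervals are separated by commas and that "_" or "∞" (optionally signed) means infinity.

```python
import math
import re
from typing import List, Tuple

# (low, low_inclusive, high, high_inclusive)
Interval = Tuple[float, bool, float, bool]

_INTERVAL = re.compile(r"([\[(])\s*([^,\s]+)\s*,\s*([^,\s\])]+)\s*([\])])")
_SEP = re.compile(r"\s*,\s*")


def _number(token: str) -> float:
    """Turn a bound token into a float; '_' and '∞' stand for infinity."""
    if token in ("_", "+_", "∞", "+∞"):
        return math.inf
    if token in ("-_", "-∞"):
        return -math.inf
    return float(token)  # raises ValueError on anything else


def parse_intervals(text: str) -> List[Interval]:
    """Parse e.g. '(-_,10],(30,_)' into a list of intervals."""
    text = text.strip()
    intervals: List[Interval] = []
    pos = 0
    while True:
        m = _INTERVAL.match(text, pos)
        if m is None:
            raise ValueError(f"expected an interval at position {pos}")
        open_br, low, high, close_br = m.groups()
        lo, hi = _number(low), _number(high)
        if lo > hi:
            raise ValueError(f"lower bound exceeds upper bound: {m.group(0)}")
        intervals.append((lo, open_br == "[", hi, close_br == "]"))
        pos = m.end()
        if pos == len(text):
            return intervals
        sep = _SEP.match(text, pos)
        if sep is None or sep.end() == len(text):
            raise ValueError(f"unexpected text at position {pos}")
        pos = sep.end()


def contains(intervals: List[Interval], x: float) -> bool:
    """True if x falls inside any of the intervals."""
    for lo, lo_inc, hi, hi_inc in intervals:
        above = x > lo or (lo_inc and x == lo)
        below = x < hi or (hi_inc and x == hi)
        if above and below:
            return True
    return False
```

With this, `contains(parse_intervals("(-_,10],(30,_)"), 10)` is true and the same call with 20 is false.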

I hope you find it helpful.

+1




PostgreSQL supports range types natively.

The representation looks like this:

[low, high)

      

[ or ] = inclusive

( or ) = exclusive

Without an upper limit it looks like this: [low-value, infinity]

http://www.postgresql.org/docs/9.4/static/rangetypes.html
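If your bounds live in application code, a small helper can produce this bracket notation as text. This is only a sketch: `to_range_literal` is a hypothetical function, not part of PostgreSQL or any driver. It follows the convention from the linked documentation that an unbounded side is written by leaving that bound empty, and it always uses a round bracket on an empty side.

```python
from typing import Optional


def to_range_literal(low: Optional[float], high: Optional[float],
                     low_inclusive: bool = True,
                     high_inclusive: bool = False) -> str:
    """Format bounds as range text, e.g. '[50,100)'; None means unbounded."""
    if low is not None and high is not None and low > high:
        raise ValueError("lower bound must not exceed upper bound")
    # A missing bound is always written with a round bracket.
    left = "[" if (low_inclusive and low is not None) else "("
    right = "]" if (high_inclusive and high is not None) else ")"
    lo = "" if low is None else str(low)
    hi = "" if high is None else str(high)
    return f"{left}{lo},{hi}{right}"
```

So >= 50 && < 100 becomes "[50,100)", and >= 50 on its own becomes "[50,)".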

+1




Specifically addressing your options:

  1. Why store it in a format that needs to be parsed? Presumably you would store it in a format that your own code can parse, but what if you need to access it from another programming language?

  2. Same as 1.

  3. Getting close, but you would need to wrap the bounds in a range object that records && or ||. Also, the sign element is unnecessary: it is already implied by "lower" and "upper", and can be replaced by an inclusive flag as you did in 4.

  4. Not necessary.

  5. An unnecessary abstraction ... it's just a range.

  6. This might work.
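To illustrate why option 6 is workable, here is a minimal sketch that stores its fields as JSON and evaluates them. The element names (`lowerBound`, `upperBound`, `isIncludeEqual`, `data`, `isAnd`) come from the question; the functions `to_json` and `matches` are hypothetical, and the reading of `isAnd` = false as "at or below the lower value, or above the upper value" is an assumption based on the question's <= 50 || > 100 example.

```python
import json


def to_json(low, low_inclusive, high, high_inclusive, is_and) -> str:
    """Serialize the option 6 fields as a JSON document."""
    return json.dumps({
        "lowerBound": {"isIncludeEqual": low_inclusive, "data": low},
        "upperBound": {"isIncludeEqual": high_inclusive, "data": high},
        "isAnd": is_and,
    })


def matches(doc: str, x) -> bool:
    """Check x against a range stored with to_json."""
    r = json.loads(doc)
    lo, hi = r["lowerBound"], r["upperBound"]
    if r["isAnd"]:
        # Inside: x is above the lower value AND below the upper value.
        above = x > lo["data"] or (lo["isIncludeEqual"] and x == lo["data"])
        below = x < hi["data"] or (hi["isIncludeEqual"] and x == hi["data"])
        return above and below
    # Outside: x is below the lower value OR above the upper value.
    under = x < lo["data"] or (lo["isIncludeEqual"] and x == lo["data"])
    over = x > hi["data"] or (hi["isIncludeEqual"] and x == hi["data"])
    return under or over
```

No string parsing is needed here: the only validation left is checking the field types and that the lower value does not exceed the upper one.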

Another data model:

The data is structured, so it can work in json, xml, relational, or even as a set of semantic triplets.

0








