Select nearest neighbors

consider the following data:

category | index | value
-------------------------
cat 1    | 1     | 2
cat 1    | 2     | 3
cat 1    | 3     |  
cat 1    | 4     | 1
cat 2    | 1     | 5
cat 2    | 2     |  
cat 2    | 3     |  
cat 2    | 4     | 6
cat 3    | 1     |  
cat 3    | 2     |  
cat 3    | 3     | 2 
cat 3    | 4     | 1

      

I'm trying to fill the holes so that hole = avg(value)

of the two nearest neighbors with non-zero values ​​in the category:

category | index | value
-------------------------
cat 1    | 1     | 2
cat 1    | 2     | 3
cat 1    | 3     | 2*
cat 1    | 4     | 1
cat 2    | 1     | 5
cat 2    | 2     | 5.5*
cat 2    | 3     | 5.5* 
cat 2    | 4     | 6
cat 3    | 1     | 1.5*
cat 3    | 2     | 1.5* 
cat 3    | 3     | 2 
cat 3    | 4     | 1

      

I'm playing around with window functions and I'm sure this can be achieved, but the solution eludes me.

Any ideas?

+3


source to share


3 answers


You are right, the window function is what you are looking for. Here's how it can be done (the part is with

used to define the table, so you probably won't need it):

with dt as
(
    select * from
    (
        values
            ('cat 1', 1, 2),
            ('cat 1', 2, 3),
            ('cat 1', 3, null),
            ('cat 1', 4, 1),
            ('cat 2', 1, 5),
            ('cat 2', 2, null),
            ('cat 2', 3, null),
            ('cat 2', 4, 6),
            ('cat 3', 1, null),
            ('cat 3', 2, null),
            ('cat 3', 3, 1),
            ('cat 3', 4, 2)

    ) tbl ("category", "index", "value")
)
select
        "category",
        "index",
        case
            when "value" is null then (avg("value") over (partition by "category") )
            else "value"
        end
    from dt
    order by "category", "index";

      



refer to a section of WINDOW Clause

this page for more information on window functions.

+1


source


I worked on a solution for you, but the SQLfiddle is giving (internal) errors at the moment, so I cannot complete it.

A statement like this should do the update for you:



update table1 as t1
set value = 
  (select avg(value)
   from 
   (select value
    from table1 as t3
    where t1.category = t3.category
    and   t3.index in (t1.index - 1, t1.index + 1)
    ) AS T2
   )
where value is null
;

      

The script I was working on is here: http://sqlfiddle.com/#!15/acbc2/1

-1


source


While I'm sure it is possible to do some terribly complex and nested assertion that does what you want, I meant to say that sometimes it's better to write a script in a regular programming language like python / ruby ​​/ java that does iterate over the DB table and make any changes you want.

It will be much more maintainable and you will want to rebuild the whole thing every time you need to make any changes to it (for example, use 3 nearest neighbors instead or change the definition of "nearest neighbor")

-2


source







All Articles