PIVOT VIEW using PostgreSQL

I am new to PostgreSQL and am using version 9.4. I have a table that collects measurements as strings, and I need to turn it into some sort of pivot table, ideally as something that is always up to date, like a VIEW. Some values also need to be converted, e.g. multiplied by 1000, as you can see in the example below for "sensor3".

Source table:

CREATE TABLE source (
    id bigint NOT NULL,
    name character varying(255),
    "timestamp" timestamp without time zone,
    value character varying(32672),
    CONSTRAINT source_pkey PRIMARY KEY (id)
);

INSERT INTO source VALUES
  (15,'sensor2','2015-01-03 22:02:05.872','88.4')
, (16,'foo27'  ,'2015-01-03 22:02:10.887','-3.755')
, (17,'sensor1','2015-01-03 22:02:10.887','1.1704')
, (18,'foo27'  ,'2015-01-03 22:02:50.825','-1.4')
, (19,'bar_18' ,'2015-01-03 22:02:50.833','545.43')
, (20,'foo27'  ,'2015-01-03 22:02:50.935','-2.87')
, (21,'sensor3','2015-01-03 22:02:51.044','6.56');


The contents of the source table:

| id | name      | timestamp                 | value    |
|----+-----------+---------------------------+----------|
| 15 | "sensor2" | "2015-01-03 22:02:05.872" | "88.4"   |
| 16 | "foo27"   | "2015-01-03 22:02:10.887" | "-3.755" |
| 17 | "sensor1" | "2015-01-03 22:02:10.887" | "1.1704" |
| 18 | "foo27"   | "2015-01-03 22:02:50.825" | "-1.4"   |
| 19 | "bar_18"  | "2015-01-03 22:02:50.833" | "545.43" |
| 20 | "foo27"   | "2015-01-03 22:02:50.935" | "-2.87"  |
| 21 | "sensor3" | "2015-01-03 22:02:51.044" | "6.56"   |


Desired end result:

| timestamp                 | sensor1 | sensor2 | sensor3 | foo27   | bar_18  |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" |         | 88.4    |         |         |         |
| "2015-01-03 22:02:10.887" | 1.1704  |         |         | -3.755  |         |
| "2015-01-03 22:02:50.825" |         |         |         | -1.4    |         |
| "2015-01-03 22:02:50.833" |         |         |         |         | 545.43  |
| "2015-01-03 22:02:50.935" |         |         |         | -2.87   |         |
| "2015-01-03 22:02:51.044" |         |         | 6560.00 |         |         |


Using this:

--    CREATE EXTENSION tablefunc;
SELECT *
    FROM
        crosstab(
            'SELECT
                source."timestamp",
                source.name,
                source.value
            FROM
                public.source
            ORDER BY
                1'
            ,
            'SELECT
                DISTINCT
                source.name
            FROM
                public.source
            ORDER BY
                1'
        )
    AS
        (
            "timestamp" timestamp without time zone,
            "sensor1" character varying(32672),
            "sensor2" character varying(32672),
            "sensor3" character varying(32672),
            "foo27" character varying(32672),
            "bar_18" character varying(32672)
        )
    ;


I got this result:

| timestamp                 | sensor1 | sensor2 | sensor3 | foo27   | bar_18  |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" |         |         |         | 88.4    |         |
| "2015-01-03 22:02:10.887" |         | -3.755  | 1.1704  |         |         |
| "2015-01-03 22:02:50.825" |         | -1.4    |         |         |         |
| "2015-01-03 22:02:50.833" | 545.43  |         |         |         |         |
| "2015-01-03 22:02:50.935" |         | -2.87   |         |         |         |
| "2015-01-03 22:02:51.044" |         |         |         |         | 6.56    |


Unfortunately,

  • the values are assigned to the wrong columns,
  • the columns are not dynamic; the query fails as soon as an additional entry such as "sensor4" appears in the name column, and
  • I don't know how to transform (e.g. multiply) the values of selected columns.


2 answers


Your query works like this:

SELECT * FROM crosstab(
  $$SELECT "timestamp", name
         , CASE name
           WHEN 'sensor3' THEN value::numeric * 1000
       --  WHEN 'sensor9' THEN value::numeric * 9000  -- add more ...
           ELSE value::numeric END AS value
    FROM   source
    ORDER  BY 1, 2$$
 ,$$SELECT unnest('{bar_18,foo27,sensor1,sensor2,sensor3}'::text[])$$
) AS (
  "timestamp" timestamp
, bar_18  numeric
, foo27   numeric
, sensor1 numeric
, sensor2 numeric
, sensor3 numeric);


To multiply value for selected columns, use a "simple" CASE expression. But first you have to cast to a numeric type - value::numeric in this example.

Which begs the question: why not store the value as a numeric type to begin with?
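If you can change the schema, a one-time conversion could look like this (a sketch; it assumes every stored value actually parses as a number - a row holding e.g. 'true' would abort it):

```sql
-- One-time conversion of the text column to numeric (sketch; assumes
-- all existing values are numeric strings).
ALTER TABLE source
    ALTER COLUMN value TYPE numeric
    USING value::numeric;
```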

You also need the two-parameter form of crosstab(). Detailed explanation:



Truly dynamic crosstab queries are next to impossible, as SQL demands that the result type is known in advance - at call time at the latest. But you can get close with polymorphic types:
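The core idea can be sketched as follows. The type and function names here are illustrative, not part of tablefunc, and the row type still has to be recreated whenever the set of names changes:

```sql
-- A composite type carries the result structure ...
CREATE TYPE pivot_row AS (
    "timestamp" timestamp,
    bar_18  numeric,
    foo27   numeric,
    sensor1 numeric,
    sensor2 numeric,
    sensor3 numeric
);

-- ... and a polymorphic wrapper builds the column definition list from
-- that type, so callers pass NULL::pivot_row instead of spelling out
-- the AS (...) clause every time.
CREATE FUNCTION crosstab_typed(_src text, _cat text, _rowtype anyelement)
    RETURNS SETOF anyelement
    LANGUAGE plpgsql AS
$func$
DECLARE
    _cols text;
BEGIN
    -- read the column list of the composite type from the catalogs
    SELECT string_agg(quote_ident(attname) || ' ' ||
                      format_type(atttypid, atttypmod), ', ' ORDER BY attnum)
    INTO   _cols
    FROM   pg_attribute
    WHERE  attrelid = pg_typeof(_rowtype)::text::regclass
    AND    attnum > 0
    AND    NOT attisdropped;

    RETURN QUERY EXECUTE format(
        'SELECT * FROM crosstab(%L, %L) AS t(%s)', _src, _cat, _cols);
END
$func$;

-- Call with the row type as anchor for the return type:
-- SELECT * FROM crosstab_typed('SELECT ... ORDER BY 1, 2',
--                              'SELECT ... ORDER BY 1',
--                              NULL::pivot_row);
```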



@Erwin: It told me my reply was 7128 characters too long for a comment! Anyway:

Your post pointed me in the right direction, so thanks a lot, but in my case in particular I need it to be truly dynamic. I currently have 38886 rows with 49 distinct elements (= columns to pivot).

First, to answer your question and the follow-up question from @Jasen: the layout of the source table is not up to me; I am already very happy to have this data in a DBMS at all. If it were up to me, I would always store UTC timestamps! But there is also a reason for storing the data as strings: the column can hold various data types such as boolean, integer, float, string, etc.

To avoid confusing myself further, I created a new demo dataset, with a "column_" prefix on every column name (I know some people hate this!) to avoid keyword clashes, and with the timestamps changed (one per minute) for a better overview:

--  --------------------------------------------------------------------------
--  Create demo table of given schema and insert arbitrary data
--  --------------------------------------------------------------------------

    DROP TABLE IF EXISTS table_source;

    CREATE TABLE table_source
    (
        column_id BIGINT NOT NULL,
        column_name CHARACTER VARYING(255),
        column_timestamp TIMESTAMP WITHOUT TIME ZONE,
        column_value CHARACTER VARYING(32672),
        CONSTRAINT table_source_pkey PRIMARY KEY (column_id)
    );

    INSERT INTO table_source VALUES ( 15,'sensor2','2015-01-03 22:01:05.872','88.4');
    INSERT INTO table_source VALUES ( 16,'foo27' ,'2015-01-03 22:02:10.887','-3.755');
    INSERT INTO table_source VALUES ( 17,'sensor1','2015-01-03 22:02:10.887','1.1704');
    INSERT INTO table_source VALUES ( 18,'foo27' ,'2015-01-03 22:03:50.825','-1.4');
    INSERT INTO table_source VALUES ( 19,'bar_18','2015-01-03 22:04:50.833','545.43');
    INSERT INTO table_source VALUES ( 20,'foo27' ,'2015-01-03 22:05:50.935','-2.87');
    INSERT INTO table_source VALUES ( 21,'seNSor3','2015-01-03 22:06:51.044','6.56');

    SELECT * FROM table_source;


Also, based on @Erwin's suggestions, I created a view that already converts the data types. Besides being fast, it has a nice property: conversions only need to be added for known elements, and other (new) elements are unaffected.

--  --------------------------------------------------------------------------
--  Create view to process source data
--  --------------------------------------------------------------------------

    DROP VIEW IF EXISTS view_source_processed;

    CREATE VIEW
        view_source_processed
    AS
        SELECT
            column_timestamp,
            column_name,
            CASE LOWER( column_name)
                WHEN LOWER( 'sensor3') THEN CAST( column_value AS DOUBLE PRECISION) * 1000.0
                ELSE CAST( column_value AS DOUBLE PRECISION)
            END AS column_value
        FROM
            table_source
    ;

    SELECT * FROM view_source_processed ORDER BY column_timestamp DESC LIMIT 100;


This is the desired result of the whole question:

--  --------------------------------------------------------------------------
--  Desired result:
--  --------------------------------------------------------------------------

/*
| column_timestamp          | bar_18  | foo27   | sensor1 | sensor2 | seNSor3 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:01:05.872" |         |         |         |    88.4 |         |
| "2015-01-03 22:02:10.887" |         |  -3.755 |  1.1704 |         |         |
| "2015-01-03 22:03:50.825" |         |    -1.4 |         |         |         |
| "2015-01-03 22:04:50.833" |  545.43 |         |         |         |         |
| "2015-01-03 22:05:50.935" |         |   -2.87 |         |         |         |
| "2015-01-03 22:06:51.044" |         |         |         |         |    6560 |
*/




This is @Erwin's solution, adapted to the new demo dataset. It works fine as long as the elements (= columns to pivot) do not change:

--  --------------------------------------------------------------------------
--  Solution by Erwin, modified for changed demo dataset:
--  http://stackoverflow.com/a/27773730
--  --------------------------------------------------------------------------

SELECT *
    FROM
        crosstab(
            $$
                SELECT
                    column_timestamp,
                    column_name,
                    column_value
                FROM
                    view_source_processed
                ORDER BY
                    1, 2
            $$
        ,
            $$
                SELECT
                    UNNEST( '{bar_18,foo27,sensor1,sensor2,seNSor3}'::text[])
            $$
        )
    AS
        (
            column_timestamp timestamp,
            bar_18  DOUBLE PRECISION,
            foo27   DOUBLE PRECISION,
            sensor1 DOUBLE PRECISION,
            sensor2 DOUBLE PRECISION,
            seNSor3 DOUBLE PRECISION
        )
    ;


While reading through @Erwin's links, I found @Clodoaldo Neto's dynamic SQL example and remembered that I had already done it this way in Transact-SQL; this is my attempt:

--  --------------------------------------------------------------------------
--  Dynamic attempt based on:
--  http://stackoverflow.com/a/12989297/131874
--  --------------------------------------------------------------------------

DO $DO$

DECLARE
    list_columns TEXT;

    BEGIN

        DROP TABLE IF EXISTS temp_table_pivot;

        --  space-separated list; assumes element names contain no spaces
        --  and need no quoting (otherwise use quote_ident)
        list_columns := (
            SELECT
                string_agg( DISTINCT column_name, ' ' ORDER BY column_name)
            FROM
                view_source_processed
        );

        EXECUTE(
            FORMAT(
                $format_1$
                CREATE TEMP TABLE
                    temp_table_pivot(
                        column_timestamp TIMESTAMP,
                        %1$s
                    )
                $format_1$
            ,
                (
                    REPLACE(
                        list_columns,
                        ' ',
                        ' DOUBLE PRECISION, '
                    ) || ' DOUBLE PRECISION'
                )
            )
        );

        EXECUTE(
            FORMAT(
                $format_2$
                    INSERT INTO temp_table_pivot
                        SELECT
                            *
                        FROM crosstab(
                            $crosstab_1$
                            SELECT
                                column_timestamp,
                                column_name,
                                column_value
                            FROM
                                view_source_processed
                            ORDER BY
                                column_timestamp, column_name
                            $crosstab_1$
                        ,
                            $crosstab_2$
                            SELECT DISTINCT
                                column_name
                            FROM
                                view_source_processed
                            ORDER BY
                                column_name
                            $crosstab_2$
                        )
                        AS
                        (
                            column_timestamp TIMESTAMP,
                            %1$s
                        );
                $format_2$
            ,
                REPLACE( list_columns, ' ', ' DOUBLE PRECISION, ')
                ||
                ' DOUBLE PRECISION'
            )
        );

    END;

$DO$;

SELECT * FROM temp_table_pivot ORDER BY column_timestamp DESC LIMIT 100;


Also, to get this into a stored procedure, and for performance reasons, I will try to move this to a staging table where only new values are inserted. I will keep you informed!
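The incremental part could be sketched like this (the table and column names here are my own assumptions, untested):

```sql
-- Sketch of an incremental staging step (hypothetical names):
-- remember the newest timestamp already pivoted and only feed newer
-- rows into the crosstab logic of the DO block above.
CREATE TABLE IF NOT EXISTS table_pivot_watermark
(
    last_timestamp TIMESTAMP NOT NULL
);

INSERT INTO table_pivot_watermark
SELECT '-infinity'::timestamp
WHERE NOT EXISTS ( SELECT 1 FROM table_pivot_watermark);

--  Inside the DO block, the source query would then become:
--      SELECT column_timestamp, column_name, column_value
--      FROM   view_source_processed
--      WHERE  column_timestamp >
--             ( SELECT last_timestamp FROM table_pivot_watermark)
--      ORDER BY 1, 2;
--  followed by an UPDATE of the watermark after a successful load.
```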

Thanks!!!

L.

PS: No, I didn't want to answer my own question, but the "comment" field is too small!
