PIVOT VIEW using PostgreSQL
I am new to PostgreSQL and am using version 9.4. I have a table with collected dimensions as strings and need to convert it to some sort of PIVOT table using what is always relevant, like VIEW. Also, some values ββneed to be converted, eg. d. multiplied by 1000, as you can see in the example below for "sensor3".
Source table:
CREATE TABLE source (
id bigint NOT NULL,
name character varying(255),
"timestamp" timestamp without time zone,
value character varying(32672),
CONSTRAINT source_pkey PRIMARY KEY (id)
);
INSERT INTO source VALUES
(15,'sensor2','2015-01-03 22:02:05.872','88.4')
, (16,'foo27' ,'2015-01-03 22:02:10.887','-3.755')
, (17,'sensor1','2015-01-03 22:02:10.887','1.1704')
, (18,'foo27' ,'2015-01-03 22:02:50.825','-1.4')
, (19,'bar_18' ,'2015-01-03 22:02:50.833','545.43')
, (20,'foo27' ,'2015-01-03 22:02:50.935','-2.87')
, (21,'sensor3','2015-01-03 22:02:51.044','6.56');
The result of the original table:
| id | name | timestamp | value |
|----+-----------+---------------------------+----------|
| 15 | "sensor2" | "2015-01-03 22:02:05.872" | "88.4" |
| 16 | "foo27" | "2015-01-03 22:02:10.887" | "-3.755" |
| 17 | "sensor1" | "2015-01-03 22:02:10.887" | "1.1704" |
| 18 | "foo27" | "2015-01-03 22:02:50.825" | "-1.4" |
| 19 | "bar_18" | "2015-01-03 22:02:50.833" | "545.43" |
| 20 | "foo27" | "2015-01-03 22:02:50.935" | "-2.87" |
| 21 | "sensor3" | "2015-01-03 22:02:51.044" | "6.56" |
Desired end result:
| timestamp | sensor1 | sensor2 | sensor3 | foo27 | bar_18 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" | | 88.4 | | | |
| "2015-01-03 22:02:10.887" | 1.1704 | | | -3.755 | |
| "2015-01-03 22:02:50.825" | | | | -1.4 | |
| "2015-01-03 22:02:50.833" | | | | | 545.43 |
| "2015-01-03 22:02:50.935" | | | | -2.87 | |
| "2015-01-03 22:02:51.044" | | | 6560.00 | | |
Using this:
-- CREATE EXTENSION tablefunc;
SELECT *
FROM
crosstab(
'SELECT
source."timestamp",
source.name,
source.value
FROM
public.source
ORDER BY
1'
,
'SELECT
DISTINCT
source.name
FROM
public.source
ORDER BY
1'
)
AS
(
"timestamp" timestamp without time zone,
"sensor1" character varying(32672),
"sensor2" character varying(32672),
"sensor3" character varying(32672),
"foo27" character varying(32672),
"bar_18" character varying(32672)
)
;
I got the result:
| timestamp | sensor1 | sensor2 | sensor3 | foo27 | bar_18 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" | | | | 88.4 | |
| "2015-01-03 22:02:10.887" | | -3.755 | 1.1704 | | |
| "2015-01-03 22:02:50.825" | | -1.4 | | | |
| "2015-01-03 22:02:50.833" | 545.43 | | | | |
| "2015-01-03 22:02:50.935" | | -2.87 | | | |
| "2015-01-03 22:02:51.044" | | | | | 6.56 |
Unfortunately,
- values ββare not assigned to the correct column,
- columns are not dynamic; this means the request fails when there is an additional entry in the name column such as "sensor4" and
- I don't know how to change the values ββof some columns (multiply).
source to share
Your request works like this:
SELECT * FROM crosstab(
$$SELECT "timestamp", name
, CASE name
WHEN 'sensor3' THEN value::numeric * 1000
-- WHEN 'sensor9' THEN value::numeric * 9000 -- add more ...
ELSE value::numeric END AS value
FROM source
ORDER BY 1, 2$$
,$$SELECT unnest('{bar_18,foo27,sensor1,sensor2,sensor3}'::text[])$$
) AS (
"timestamp" timestamp
, bar_18 numeric
, foo27 numeric
, sensor1 numeric
, sensor2 numeric
, sensor3 numeric);
To multiply value
by the selected columns, use "simple"CASE
. But first you need to apply the numeric type . Usage value::numeric
in this example.
Which begs the question: why not store the value as a numeric type to begin with?
You need to use the two parameter version . Detailed explanation:
Truly dynamic crosstab tables are next to impossible , as SQL requires you to know the result type in advance - at the latest at call time. But you can do something with polymorphic types:
source to share
@Erwin: He said "too long for 7128 characters" for a comment! Anyway:
Your post gave me hints for the right direction, so thanks a lot, but especially in my case, I need it to be really dynamic. I currently have 38886 rows with 49 different elements (= columns to rotate).
First answer your question and the next question @Jasen: The layout of the original table is independent of me, I am already very happy to have this data in the DBMS. If it were for me, I would always keep UTC-timestamps! But there is also a reason for storing data as strings: it can contain various data types like boolean, integer, float, string, etc.
To avoid confusing me further, I created a new demo dataset, the type data prefix (I know some people hate this!) To avoid keyword problems and change the timestamp (-> minutes) for a better overview:
-- --------------------------------------------------------------------------
-- Create demo table of given schema and insert arbitrary data
-- --------------------------------------------------------------------------
DROP TABLE IF EXISTS table_source;
CREATE TABLE table_source
(
column_id BIGINT NOT NULL,
column_name CHARACTER VARYING(255),
column_timestamp TIMESTAMP WITHOUT TIME ZONE,
column_value CHARACTER VARYING(32672),
CONSTRAINT table_source_pkey PRIMARY KEY (column_id)
);
INSERT INTO table_source VALUES ( 15,'sensor2','2015-01-03 22:01:05.872','88.4');
INSERT INTO table_source VALUES ( 16,'foo27' ,'2015-01-03 22:02:10.887','-3.755');
INSERT INTO table_source VALUES ( 17,'sensor1','2015-01-03 22:02:10.887','1.1704');
INSERT INTO table_source VALUES ( 18,'foo27' ,'2015-01-03 22:03:50.825','-1.4');
INSERT INTO table_source VALUES ( 19,'bar_18','2015-01-03 22:04:50.833','545.43');
INSERT INTO table_source VALUES ( 20,'foo27' ,'2015-01-03 22:05:50.935','-2.87');
INSERT INTO table_source VALUES ( 21,'seNSor3','2015-01-03 22:06:51.044','6.56');
SELECT * FROM table_source;
Also, based on the suggestions of @Erwin, I created a view that already converts the data type. It has a nice feature besides being fast, only add the necessary transformations for known elements, but without affecting other (new) elements.
-- --------------------------------------------------------------------------
-- Create view to process source data
-- --------------------------------------------------------------------------
DROP VIEW IF EXISTS view_source_processed;
CREATE VIEW
view_source_processed
AS
SELECT
column_timestamp,
column_name,
CASE LOWER( column_name)
WHEN LOWER( 'sensor3') THEN CAST( column_value AS DOUBLE PRECISION) * 1000.0
ELSE CAST( column_value AS DOUBLE PRECISION)
END AS column_value
FROM
table_source
;
SELECT * FROM view_source_processed ORDER BY column_timestamp DESC LIMIT 100;
This is the desired result of the whole question:
-- --------------------------------------------------------------------------
-- Desired result:
-- --------------------------------------------------------------------------
/*
| column_timestamp | bar_18 | foo27 | sensor1 | sensor2 | seNSor3 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:01:05.872" | | | | 88.4 | |
| "2015-01-03 22:02:10.887" | | -3.755 | 1.1704 | | |
| "2015-01-03 22:03:50.825" | | -1.4 | | | |
| "2015-01-03 22:04:50.833" | 545.43 | | | | |
| "2015-01-03 22:05:50.935" | | -2.87 | | | |
| "2015-01-03 22:06:51.044" | | | | | 6560 |
*/
This is @ Erwin's solution taken for new data source demo. This is fine as long as the elements (= columns to be rotated) do not change:
-- --------------------------------------------------------------------------
-- Solution by Erwin, modified for changed demo dataset:
-- http://stackoverflow.com/a/27773730
-- --------------------------------------------------------------------------
SELECT *
FROM
crosstab(
$$
SELECT
column_timestamp,
column_name,
column_value
FROM
view_source_processed
ORDER BY
1, 2
$$
,
$$
SELECT
UNNEST( '{bar_18,foo27,sensor1,sensor2,seNSor3}'::text[])
$$
)
AS
(
column_timestamp timestamp,
bar_18 DOUBLE PRECISION,
foo27 DOUBLE PRECISION,
sensor1 DOUBLE PRECISION,
sensor2 DOUBLE PRECISION,
seNSor3 DOUBLE PRECISION
)
;
While reading through @ Erwin's links, I found @Clodoaldo Neto's Dynamic SQL example and remembered that I already did it this way in Transact-SQL; this is my attempt:
-- --------------------------------------------------------------------------
-- Dynamic attempt based on:
-- http://stackoverflow.com/a/12989297/131874
-- --------------------------------------------------------------------------
DO $DO$
DECLARE
list_columns TEXT;
BEGIN
DROP TABLE IF EXISTS temp_table_pivot;
list_columns := (
SELECT
string_agg( DISTINCT column_name, ' ' ORDER BY column_name)
FROM
view_source_processed
);
EXECUTE(
FORMAT(
$format_1$
CREATE TEMP TABLE
temp_table_pivot(
column_timestamp TIMESTAMP,
%1$s
)
$format_1$
,
(
REPLACE(
list_columns,
' ',
' DOUBLE PRECISION, '
) || ' DOUBLE PRECISION'
)
)
);
EXECUTE(
FORMAT(
$format_2$
INSERT INTO temp_table_pivot
SELECT
*
FROM crosstab(
$crosstab_1$
SELECT
column_timestamp,
column_name,
column_value
FROM
view_source_processed
ORDER BY
column_timestamp, column_name
$crosstab_1$
,
$crosstab_2$
SELECT DISTINCT
column_name
FROM
view_source_processed
ORDER BY
column_name
$crosstab_2$
)
AS
(
column_timestamp TIMESTAMP,
%1$s
);
$format_2$
,
REPLACE( list_columns, ' ', ' DOUBLE PRECISION, ')
||
' DOUBLE PRECISION'
)
);
END;
$DO$;
SELECT * FROM temp_table_pivot ORDER BY column_timestamp DESC LIMIT 100;
Also, to get this in a stored procedure, I, for performance reasons, try to move this to a staging table where only new values ββare inserted. I will keep you informed!
Thank!!!
L.
PS: NO, I don't want to answer my own question, but the "comment" field is too small!
source to share