Check if date is valid in PostgreSQL source
I have a remote PG data source where I cannot create custom functions. I have to dump PG data to Unicode tables on MSSQL Server 2008 R2. I need a built-in PostgreSQL statement that replaces the invalid date with something like "1900-01-01".
I did an extensive google search to no avail ... thanks.
source to share
PostgreSQL has a much wider range for timestamps than SQL Server for datetime. In PostgreSQL, "0214-06-19 00:00:00" is a valid timestamp. So "0214-06-19 00:00:00 BC".
It's not clear to me if the result should be a date or a timestamp. But this shows how you should probably approach the problem in PostgreSQL
with data (ts) as (
values (timestamp '0214-06-19 00:00:00'), ('1900-01-01 08:00')
)
select
ts as source_ts,
case when ts < timestamp '1900-01-01' then timestamp '1900-01-01'
else ts
end as altered_ts
from data;
source_ts altered_ts - 0214-06-19 00:00:00 1900-01-01 00:00:00 1900-01-01 08:00:00 1900-01-01 08:00:00
Assuming every date before 1900 should be 1900-01-01 is risky. The value "0214-06-19" is probably a typo for 2014-06-19.
source to share
More complex algorithms in SQL are easier to write (and read) with CTEs:
with
syntax as (
select str,
str ~ '^\d{4}-\d{2}-\d{2}$' as syntax_ok,
split_part(str, '-', 1) as syy,
split_part(str, '-', 2) as smm,
split_part(str, '-', 3) as sdd
from test_date),
toint as (
select *,
case when syntax_ok then syy::int else 1900 end yy,
case when syntax_ok then smm::int else 1 end mm,
case when syntax_ok then sdd::int else 1 end dd
from source),
semantics as (
select *,
case
when mm in (1,3,5,7,8,10,12) then dd between 1 and 31
when mm in (4,6,9,11) then dd between 1 and 30
when mm = 2 then
case when yy/4*4 = yy and (yy/100*100 <> yy or yy/400*400 = yy)
then dd between 1 and 29
else dd between 1 and 28 end
else false end as valid
from toint),
result as (
select str,
case when syntax_ok and valid then str
else '1900-01-01' end as date
from semantics)
select * from result
The first query checks the syntax and splits str
into three parts, the second query distinguishes the parts as integers, and the third query checks the semantics.
SQLFiddle .
Your comment shows that you don't need such careful checks, but I think this query can be easily customized to your liking.
source to share