Python Pandas: changing the value of the first entry in each month

I would like to change the value associated with the first day of every month in a pandas.Series that I have. For example, given something like this:

Date
1984-01-03    0.992701
1984-01-04    1.003614
1984-01-17    0.994647
1984-01-18    1.007440
1984-01-27    1.006097
1984-01-30    0.991546
1984-01-31    1.002928
1984-02-01    1.009894
1984-02-02    0.996608
1984-02-03    0.996595
                ...

      

I would like to change the values associated with 1984-01-03, 1984-02-01 and so on. I racked my brain for hours and searched Stack Overflow. Some solutions came close. For example, using:

[In]: series.groupby((m_ret.index.year, m_ret.index.month)).first()

[Out]:

Date  Date
1984  1       0.992701
      2       1.009894
      3       1.005963
      4       0.997899
      5       1.000342
      6       0.995429
      7       0.994620
      8       1.019377
      9       0.993209
      10      1.000992
      11      1.009786
      12      0.999069
1985  1       0.981220
      2       1.011928
      3       0.993042
      4       1.015153
                ...

      

Almost there, but I can't work out how to go on from here.

What I would like to do is set the values associated with the first day present in each month of each year to 1.

series[m_ret.index.is_month_start] = 1

is close, but the problem is that is_month_start only selects rows where the day of the month is 1. In my series that is not always the case, as you can see. For example, the first date in January is 1984-01-03.
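A quick check makes the mismatch visible. This is only a minimal sketch on a handful of the dates shown above; the names idx and flags are chosen for illustration.

```python
import pandas as pd

# A few of the dates from the sample series above.
idx = pd.DatetimeIndex(
    ["1984-01-03", "1984-01-04", "1984-01-31", "1984-02-01", "1984-02-02"]
)

# is_month_start is True only where the calendar day is 1,
# regardless of which dates are actually present in the index.
flags = idx.is_month_start.tolist()
print(flags)  # [False, False, False, True, False]
```

So 1984-02-01 is picked up, but 1984-01-03, the first date present in January, is not.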

series.groupby(pd.TimeGrouper('BM')).nth(0)

doesn't return the first day either. The values are the first ones of each month, but they are labelled with the last business day of the month:

Date
1984-01-31    0.992701
1984-02-29    1.009894
1984-03-30    1.005963
1984-04-30    0.997899
1984-05-31    1.000342
1984-06-29    0.995429
1984-07-31    0.994620
1984-08-31    1.019377
                ...

      

I am completely stumped. Your help is, as always, greatly appreciated! Thanks.



1 answer


One way is to use your .groupby((m_ret.index.year, m_ret.index.month)) idea, but apply idxmin to the index itself, converted to a Series:



In [74]: s.index.to_series().groupby([s.index.year, s.index.month]).idxmin()
Out[74]: 
Date  Date
1984  1      1984-01-03
      2      1984-02-01
Name: Date, dtype: datetime64[ns]

In [75]: start = s.index.to_series().groupby([s.index.year, s.index.month]).idxmin()

In [76]: s.loc[start] = 999

In [77]: s
Out[77]: 
Date
1984-01-03    999.000000
1984-01-04      1.003614
1984-01-17      0.994647
1984-01-18      1.007440
1984-01-27      1.006097
1984-01-30      0.991546
1984-01-31      1.002928
1984-02-01    999.000000
1984-02-02      0.996608
1984-02-03      0.996595
dtype: float64
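
For reference, here is the same idea as a self-contained sketch, rebuilt on the sample values from the question. It is an illustration under a few assumptions rather than the exact session above: it uses min() instead of idxmin(), which yields the same dates here because the grouped values are the index labels themselves, and it assigns through a boolean mask. The names first_dates and mask are made up for the example. It sets the values to 1, as the question asks.

```python
import pandas as pd

# Sample data copied from the question.
s = pd.Series(
    [0.992701, 1.003614, 0.994647, 1.007440, 1.006097,
     0.991546, 1.002928, 1.009894, 0.996608, 0.996595],
    index=pd.DatetimeIndex(
        ["1984-01-03", "1984-01-04", "1984-01-17", "1984-01-18", "1984-01-27",
         "1984-01-30", "1984-01-31", "1984-02-01", "1984-02-02", "1984-02-03"],
        name="Date",
    ),
)

# Earliest date actually present in each (year, month) group.
first_dates = s.index.to_series().groupby([s.index.year, s.index.month]).min()

# Boolean mask marking the rows whose date is one of those first dates.
mask = s.index.isin(first_dates)

# Overwrite only those rows; every other value is left untouched.
s[mask] = 1.0
print(s)
```

Because the earliest date is computed per group, this does not depend on the first entry falling on day 1 of the month.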

      
