Replace the previous string value if the value is 0 in R
I want to check if the value of a column in the dataframe is 0, and if it is 0, I want to substitute the previous row value of that column.
Let's assume my data is like this:
X
683 24.48
684 24.48
685 24.48
686 24.48
687 24.48
688 24.48
689 25.02
690 25.02
691 25.02
692 25.02
693 25.02
694 25.02
695 25.02
696 25.02
697 25.32
698 25.32
699 25.32
700 0.00
701 0.00
702 0.00
703 0.00
704 0.00
705 0.00
706 0.00
707 0.00
708 0.00
709 0.00
710 0.00
711 0.00
712 0.00
713 0.00
714 0.00
715 0.00
716 0.00
717 0.00
718 0.00
719 0.00
720 0.00
721 0.00
722 0.00
723 0.00
724 0.00
725 0.00
726 0.00
727 0.00
728 0.00
729 0.00
730 0.00
731 0.00
732 0.00
733 0.00
734 0.00
735 0.00
736 0.00
737 0.00
738 0.00
739 0.00
740 0.00
741 0.00
742 0.00
743 0.00
744 0.00
745 0.00
746 0.00
747 0.00
748 0.00
749 0.00
750 0.00
751 0.00
752 0.00
753 0.00
754 0.00
755 0.00
756 0.00
757 0.00
758 0.00
759 0.00
760 0.00
761 0.00
762 0.00
763 0.00
764 0.00
765 0.00
766 0.00
767 0.00
768 0.00
769 0.00
770 0.00
771 0.00
772 0.00
773 0.00
774 0.00
775 0.00
776 0.00
777 0.00
778 0.00
779 0.00
780 0.00
781 0.00
782 0.00
783 0.00
784 0.00
785 0.00
786 0.00
787 0.00
788 0.00
789 0.00
790 0.00
791 0.00
792 0.00
793 0.00
794 0.00
795 0.00
796 0.00
797 0.00
798 0.00
799 0.00
800 0.00
801 0.00
802 0.00
803 0.00
804 0.00
805 0.00
806 0.00
807 0.00
808 0.00
809 0.00
810 0.00
811 0.00
812 0.00
813 0.00
814 0.00
815 0.00
816 0.00
817 0.00
818 0.00
819 0.00
820 0.00
821 0.00
822 0.00
823 0.00
824 0.00
825 0.00
826 0.00
827 0.00
828 0.00
829 0.00
830 0.00
831 0.00
832 0.00
833 0.00
834 0.00
835 0.00
836 0.00
837 0.00
838 0.00
839 0.00
840 0.00
841 0.00
842 0.00
843 0.00
844 0.00
845 0.00
846 0.00
847 0.00
848 0.00
849 0.00
850 0.00
851 0.00
852 0.00
853 0.00
854 0.00
855 0.00
856 0.00
857 0.00
858 0.00
859 0.00
860 0.00
861 0.00
862 0.00
863 0.00
864 0.00
865 0.00
866 0.00
867 0.00
868 0.00
869 0.00
870 0.00
871 0.00
872 0.00
873 0.00
874 0.00
875 0.00
876 0.00
877 0.00
878 0.00
879 0.00
880 0.00
881 0.00
882 0.00
883 0.00
884 0.00
885 0.00
886 0.00
887 0.00
888 0.00
889 0.00
890 0.00
891 0.00
892 0.00
893 0.00
894 0.00
895 0.00
896 0.00
897 0.00
898 0.00
899 0.00
900 0.00
901 0.00
902 0.00
903 0.00
904 0.00
905 0.00
906 0.00
907 0.00
908 0.00
909 0.00
910 0.00
911 0.00
912 0.00
913 0.00
914 0.00
915 0.00
916 0.00
917 0.00
918 0.00
919 0.00
920 0.00
921 0.00
922 0.00
923 0.00
924 0.00
925 0.00
926 0.00
927 0.00
928 0.00
929 0.00
930 0.00
931 0.00
932 0.00
933 0.00
934 0.00
935 0.00
936 0.00
937 0.00
938 0.00
939 0.00
940 0.00
941 0.00
942 0.00
943 0.00
944 0.00
945 0.00
946 0.00
947 0.00
948 0.00
949 0.00
950 0.00
951 0.00
952 0.00
953 0.00
954 0.00
955 0.00
956 0.00
957 0.00
958 0.00
959 0.00
960 0.00
961 0.00
962 0.00
963 0.00
964 0.00
965 0.00
966 0.00
967 0.00
968 0.00
969 0.00
970 0.00
971 0.00
972 0.00
973 0.00
974 0.00
975 0.00
976 0.00
977 0.00
978 0.00
979 0.00
980 0.00
981 0.00
982 0.00
983 0.00
984 0.00
985 0.00
986 0.00
987 0.00
988 0.00
989 0.00
990 0.00
991 0.00
992 0.00
993 0.00
994 0.00
995 0.00
996 0.00
997 0.00
998 0.00
999 0.00
1000 0.00
1001 0.00
1002 0.00
1003 0.00
1004 0.00
1005 0.00
1006 0.00
1007 0.00
1008 0.00
1009 0.00
1010 0.00
1011 0.00
1012 0.00
1013 0.00
1014 0.00
1015 0.00
1016 0.00
1017 0.00
1018 0.00
1019 0.00
1020 0.00
1021 0.00
1022 0.00
1023 0.00
1024 0.00
1025 0.00
1026 0.00
1027 0.00
1028 0.00
1029 0.00
1030 0.00
1031 0.00
1032 0.00
1033 0.00
1034 0.00
1035 0.00
1036 0.00
1037 0.00
1038 0.00
1039 0.00
1040 0.00
1041 0.00
1042 0.00
1043 0.00
1044 0.00
1045 0.00
1046 0.00
1047 0.00
1048 0.00
1049 0.00
1050 0.00
1051 0.00
1052 0.00
1053 0.00
1054 0.00
1055 0.00
1056 0.00
1057 0.00
1058 0.00
1059 0.00
1060 0.00
1061 0.00
1062 0.00
1063 0.00
1064 0.00
1065 0.00
1066 0.00
1067 0.00
1068 0.00
1069 0.00
1070 0.00
1071 0.00
1072 0.00
1073 0.00
1074 0.00
1075 0.00
1076 0.00
1077 0.00
1078 0.00
1079 0.00
1080 0.00
1081 0.00
1082 0.00
1083 0.00
1084 0.00
1085 0.00
1086 0.00
1087 0.00
1088 0.00
1089 0.00
1090 0.00
1091 0.00
1092 0.00
1093 0.00
1094 0.00
1095 0.00
1096 0.00
1097 0.00
1098 0.00
1099 0.00
1100 0.00
1101 0.00
1102 0.00
1103 0.00
1104 0.00
1105 0.00
1106 0.00
1107 0.00
1108 0.00
1109 0.00
1110 0.00
1111 0.00
1112 0.00
1113 0.00
1114 0.00
1115 0.00
1116 0.00
1117 0.00
1118 0.00
1119 0.00
1120 0.00
1121 0.00
1122 0.00
1123 0.00
1124 0.00
1125 0.00
1126 0.00
1127 0.00
1128 0.00
1129 0.00
1130 0.00
1131 0.00
1132 0.00
1133 0.00
1134 0.00
1135 0.00
1136 0.00
1137 0.00
1138 0.00
1139 0.00
1140 0.00
1141 0.00
1142 0.00
1143 0.00
1144 0.00
1145 0.00
1146 0.00
1147 0.00
1148 0.00
1149 0.00
1150 0.00
1151 0.00
1152 0.00
1153 0.00
1154 0.00
1155 0.00
1156 0.00
1157 0.00
1158 0.00
1159 0.00
1160 0.00
1161 0.00
1162 0.00
1163 0.00
1164 0.00
1165 0.00
1166 0.00
1167 0.00
1168 0.00
1169 0.00
1170 0.00
1171 0.00
1172 0.00
1173 0.00
1174 0.00
1175 0.00
1176 0.00
1177 0.00
1178 0.00
1179 0.00
1180 0.00
1181 0.00
1182 0.00
1183 0.00
1184 0.00
1185 0.00
1186 0.00
1187 0.00
1188 0.00
1189 0.00
1190 0.00
1191 0.00
1192 0.00
1193 0.00
1194 0.00
1195 0.00
1196 0.00
1197 0.00
1198 26.16
1199 26.16
1200 26.16
1201 26.34
Here at 0s, I want a value of 25.32. The fact is that initially my data starts at 0.
How can we do this in R?
Any help would be appreciated.
thank
source to share
Here's a more general solution that replaces any sequence of zeros with the previous non-zero value:
# example
DF <- data.frame(a=c(0,2,0,0,3,0))
# a
# 1 0
# 2 2
# 3 0
# 4 0
# 5 3
# 6 0
# replacement
library(zoo)
is.na(DF$a) <- DF$a == 0
DF$a <- na.locf(DF$a,na.rm=FALSE)
# a
# 1 NA
# 2 2
# 3 2
# 4 2
# 5 3
# 6 3
You will need to install the zoo package. This approach overwrites the zeros with NA
, and then replaces everything NA
by taking the last observation forward (LOCF).
It's a bad idea if you have a legitimate NA
one before the process starts and you don't want to be treated this way. You can check this by looking table(DF$a,useNA="always")
.
To keep leading zeros. To store the first string of zeros as zeros, one could do
DF$a <- na.locf(c(0,DF$a), na.rm=FALSE)[-1]
again after the pattern suggested by @ A.Webb. Use 0L
if DF$a
is an integer vector.
For non-negative, increasing variables. In this special case, @akrun's suggestion will work:
DF$a <- cummax(DF$a)
It is concise and does not require the use of a package.
source to share
This approach should work (as suggested by @ A.Webb):
# example
DF <- data.frame(a=c(0,2,0,3))
# a
# 1 0
# 2 2
# 3 0
# 4 3
# replacement
w <- which(DF$a==0)
DF$a <- replace(DF$a, w, c(NA,DF$a)[w])
# a
# 1 NA
# 2 2
# 3 2
# 4 3
This approach does not fill in the strings of zeros (not originally mentioned by the OP), which I think require a completely different approach (posted in a separate answer).
source to share
> Matrix <- as.data.frame(matrix(data = 2, nrow = 4, ncol = 2))
> Matrix
V1 V2
1 2 2
2 2 2
3 2 2
4 2 2
> Matrix[2,1] = 0
> Matrix
V1 V2
1 2 2
2 0 2
3 2 2
4 2 2
> for(i in 1:3)
+ for(j in 1:2)
+ if(Matrix[i,j] == 0)
+ Matrix[i,j] = Matrix[(i-1),j]
> Matrix
V1 V2
1 2 2
2 2 2
3 2 2
4 2 2
Can't answer, so I'm not sure what you mean, but if you're wondering how to query the dataframe for a null value and replace it with the value from the next line. This approach is only useful for small data frames. If you have a dataframe with more than 1000 cells, I would suggest using the match () statement where you have a loop over the columns and do something like this "> Match <-match (0, Matrix [, i])" ... This will indicate the location on the line where the match occurred and you can implement some of the above code. Hope this helps.
source to share