o TcS@sddlZddlZddlZddlZddlZddlZed ddZ ddZ ddZ d d Z d d Z d dZddZddZddZddZddZdS)Nignorec7 Cstt|dddd}t|ddd} t|dd} ttd|dttd|dttd|dttd|dd} d D]} | | d } | | d }t| d krt|d kr|D]j}tt|d d dd}t|d dd}t|dd}tj|| | |||d}|}t|d}t|dd}t|dd}t| d t| d t| d }| | d  |qft|d krLt| d krL| D]j}tt|d d dd}t|d dd}t|dd}tj|| | |||d}|}t|d}t|dd}t|dd}t| d t| d t| d } | | d  | qqIdD]K} | d| d}!| d| d}"t|!d kr}t|"d kr}| d| d| d| d<t|"d krt|!d kr| d| d| d| d<qOi}#d| d| fD]} | | d}$t|$d krQt |$}%t |$}&|%|&kst|$dkrtjdd}'n^g}(tt|$D]<})|$|)|&kr|( tjddtj|$|)dtjddq|( tj|$|)ddtj|$|)dqt|(}*t|*|*d kr,|*d }'nt |*}'tj||%d}+tj||&d},t|+|,|'|'tj}-ng}-|-|#| d<q|dkrcd}.n|dkrjd}.tjtt|ddd| | d}/|.dkrd|/}/d D]} |#| d| d|/|#| d|. d<q|#d| d}-|#d}0g}1g}2|-D]}3|3}4|4jd|4jd|4j}5|1 |3|5dq|#d}0g}2|0D] }6|2 |6d q|1|2fS)!a! Create arrays of requested dates plotting and dates expected to be in MET .stat files Args: date_type - string of describing the treatment of dates, either VALID or INIT date_beg - string of beginning date, either blank or %Y%m%d format date_end - string of end date, either blank or %Y%m%d format fcst_valid_hour - string of forecast valid hour(s) information, blank or in %H%M%S fcst_init_hour - string of forecast init hour(s) information, blank or in %H%M%S obs_valid_hour - string of observation valid hour(s) information, blank or in %H%M%S obs_init_hour - string of observation hour(s) information, blank or in %H%M%S lead - string of forecast lead, in %H%M%S format Returns: plot_time_dates - array of ordinal dates based on user provided information expected_stat_file_dates - array of dates that are expected to be found in the MET .stat files based on user provided information, formatted as %Y%m%d_%H%M%S Ni<z, )Zfcst_valid_timeZfcst_init_timeZobs_valid_timeZ obs_init_time)ZfcstobsZ _valid_timeZ _init_timer)seconds)validinitZfcst__timeZobs_iQZ235959z%H%M%Sz %Y%m%d%H%M%SZ_datesZVALIDZINIT_Zfcst_valid_datesg@z %Y%m%d_%H%M%S)intlistfiltersplitlendatetime timedelta total_secondsstrzfillappendlowerminmaxrangestrptimenparrayallarangeastypetimehourminutesecond toordinalstrftime)7Z date_typeZdate_begZdate_endZfcst_valid_hourZfcst_init_hourZobs_valid_hourZ obs_init_hourleadZlead_hour_secondsZlead_min_secondsZ lead_secondsZvalid_init_time_infotypeZvalid_time_listZinit_time_listZitimeZitime_hour_secondsZitime_min_secondsZ itime_secondsoffsetZtot_secZ valid_hourZ valid_minZ valid_secZ valid_timeZvtimeZvtime_hour_secondsZvtime_min_secondsZ vtime_secondsZ init_hourZinit_minZinit_secZ init_timeZfcst_time_listZ obs_time_listZ date_infoZ time_listZtime_begZtime_endZdelta_tZ delta_t_listtZ delta_t_arrayZbegenddatesZoppo_date_typeZlead_timedeltaZfv_datesZplot_time_datesZexpected_stat_file_datesdatedtr Zfv_dater4D/lfs/h2/emc/vpppg/save/jiayi.peng/tc_verify/ush/plot_tropcyc_util.pyget_date_arrays s<!                            r6cCs|d}d}d}|D]}|dkrq |}dD]}||vr$|}||d}q|dvr6|d|7}|d|7}q |dvrG|d|7}|d |7}q |d vrX|d |7}|d |7}q |d vri|d|7}|d|7}q |dvrz|d|7}|d|7}q |dvr|d|7}|d|7}q ||fS)aJ! Format thresholds for file naming Args: thresh - string of the treshold(s) Return: thresh_symbol - string of the threshold(s) with symbols thresh_letters - string of the threshold(s) with letters  ) >=>==!=<=<gegteqnelelt)r:r@r:r@)r9r?r9r?)r>rDr>rD)r=rCr=rC)r;rAr;rA)r<rBr<rB)rreplace)ZthreshZ thresh_listZ thresh_symbolZ thresh_letterZ thresh_valueoptZ thresh_optr4r4r5 format_threshs@         rGcCs(t|}|dkrgd}|Sgd}|S)a! Get the standard MET .stat file columns based on version number Args: met_version - string of MET version number being used to run stat_analysis Returns: stat_file_base_columns - list of the standard columns shared among the different line types g333333 @)VERSIONMODELDESC FCST_LEADFCST_VALID_BEGFCST_VALID_ENDOBS_LEAD OBS_VALID_BEG OBS_VALID_ENDFCST_VARFCST_LEVOBS_VAROBS_LEVOBTYPEVX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH OBS_THRESH COV_THRESHALPHA LINE_TYPE)rHrIrJrKrLrMrNrOrPrQZ FCST_UNITSrRrSZ OBS_UNITSrTrUrVrWrXrYrZr[r\r])float) met_versionZstat_file_base_columnsr4r4r5get_stat_file_base_columnss r`cCst|}|dkr|dkrgd}|S|dkr |dkrgd}|S|dkr8|dkr.gd}|S|d kr6gd }|S|d krF|dkrDgd }|S|d kra|d krTgd}|S|d|td|S|dkrm|dkrmgd}|S)a! Get the MET .stat file columns for line type based on version number Args: met_version - string of MET version number being used to run stat_analysis line_type - string of the line type of the MET .stat file being read Returns: stat_file_line_type_columns - list of the line type columns SL1L2g@)TOTALFBAROBARFOBARFFBAROOBARMAESAL1L2)rbFABAROABARFOABARFFABAROOABARrhVL1L2gffffff@)rbUFBARVFBARUOBARVOBARUVFOBARUVFFBARUVOOBARg@) rbrprqrrrsrtrurvZ F_SPEED_BARZ O_SPEED_BARVAL1L2)rbUFABARVFABARUOABARVOABARUVFOABARUVFFABARUVOOABARVCNT)7rbrcZFBAR_NCLZFBAR_NCUrdZOBAR_NCLZOBAR_NCUFS_RMSZ FS_RMS_NCLZ FS_RMS_NCUOS_RMSZ OS_RMS_NCLZ OS_RMS_NCUMSVEZMSVE_NCLZMSVE_NCURMSVEZ RMSVE_NCLZ RMSVE_NCUFSTDEVZ FSTDEV_NCLZ FSTDEV_NCUOSTDEVZ OSTDEV_NCLZ OSTDEV_NCUFDIRZFDIR_NCLZFDIR_NCUODIRZODIR_NCLZODIR_NCU FBAR_SPEEDZFBAR_SPEED_NCLZFBAR_SPEED_NCU OBAR_SPEEDZOBAR_SPEED_NCLZOBAR_SPEED_NCU VDIFF_SPEEDZVDIFF_SPEED_NCLZVDIFF_SPEED_NCU VDIFF_DIRZ VDIFF_DIR_NCLZ VDIFF_DIR_NCU SPEED_ERRZ SPEED_ERR_NCLZ SPEED_ERR_NCUZ SPEED_ABSERRZSPEED_ABSERR_NCLZSPEED_ABSERR_NCUDIR_ERRZ DIR_ERR_NCLZ DIR_ERR_NCUZ DIR_ABSERRZDIR_ABSERR_NCLZDIR_ABSERR_NCUz%VCNT is not a valid LINE_TYPE in METVrCTC)rbFY_OYFY_ONFN_OYFN_ON)r^errorexit)loggerr_ line_typeZstat_file_line_type_columnsr4r4r5get_stat_file_line_type_columnss>2-(#rcs>tt|t|krtt|}t|}n t|}dt|}|dkr8||d}||d}n|dkrH||d}||d}|dkr[t|dd}t|dd}nt|dd}t|dd}d}|d |dtjfd d t|Dtd }tj|dd td d}t|d d d|}|S)a! Get contour levels for plotting differences or bias (centered on 0) Args: data - array of data to be contoured spacing - float for spacing for power function, value of 1.0 gives evenly spaced contour intervals Returns: clevels - array of contour levels rdg? g?rrg?cs g|] }d|qS)rr4).0idxspacingspanr4r5 xs zget_clevels..dtypeN) r!absnanminnanmaxroundr"rr^r)datarZcmaxZcminZstepsposnegZclevelsr4rr5 get_clevelsWs2       rc CsLt|dddf}|dkr.tt|dddfD]}tj||ddf||<q|S|dkr`tt|dddfD]}|tj||ddftj||ddf||<q>|S|dkr|jd}| d dg}|j d|_ t ||||\} } } tt| dddfD]}| |||<q|S|d td|S) a! Calculate average of dataset Args: logger - logging file average_method - string of the method to use to calculate the average stat - string of the statistic the average is being taken for model_dataframe - dataframe of model .stat columns model_stat_values - array of statistic values Returns: average_array - array of average value(s) NrMEANZMEDIANZ AGGREGATIONmodel_plot_namesumrz?Invalid entry for MEAN_METHOD, use MEAN, MEDIAN, or AGGREGATION)r! empty_likerrmameaninfomedianshapegroupbyaggcolumnsZ droplevelcalculate_statrr) raverage_methodstatZmodel_dataframeZmodel_stat_valuesZ average_arraylndaysZmodel_dataframe_aggsumZ avg_valuesZ avg_arraystat_plot_namer4r4r5calculate_average~s0   rc% Cs|dkrw||}|tj|} |} t|| d} | dkr0d| t| d} | S| dkrE| dkrEd| t| d} | S| dkrZ| dkrZd | t| d} | S| dkro| d krod | t| d} | S| d krud } | S|d krd\} }g}|jjD] }||dqt|} t j j dgtj d|dt d|ggdd}t j j dgtj d|dt d|ggdd}t jtj||jd}t jtj||jd}t|j}t|| |g}t|| |g}t|dd k}t|dd k}|j|dddf||d |dddf<|j|dddf||d |dddf<|j|dddf||d |dddf<|j|dddf||d |dddf<d} | |kr|| dddddf|jd| f<|| dddddf|jd| f<| d7} | |ksZtj} t|||\}}}t|||\}}}t|||||d d ddddf}t|||||d d ddddf} | |}!t|!|}"t|!|"d}#t|#|d}$d|$} | S|dtd| S)a! Calculate confidence intervals between two sets of data Args: logger - logging file ci_method - string of the method to use to calculate the confidence intervals modelB_values - array of values modelA_values - array of values total_days - float of total number of days being considered, sample size stat - string of the statistic the confidence intervals are being calculated for average_method - string of the method to use to calculate the average randx - 2D array of random numbers [0,1) Returns: intvl - float of the confidence interval ZEMCrPg\(\?r(g@gtV@rgm@z--ZEMC_MONTE_CARLO)ri'Zrand1r)rntestr1)namesZrand2)indexrg?Nz:Invalid entry for MAKE_CI_METHOD, use EMC, EMC_MONTE_CARLO)r!r count_maskedrsqrtrvaluesrrpdZ MultiIndexZ from_productr$rZ DataFramenanremptywhereZiloclocrrrrr)%r ci_methodZ modelB_valuesZ modelA_valuesZ total_daysrrZrandxZmodelB_modelA_diffrZmodelB_modelA_diff_meanZmodelB_modelA_stdZintvlrZntestsr1Zidx_valZrand1_data_indexZrand2_data_indexZ rand1_dataZ rand2_dataZncolumnsZrand1_data_valuesZrand2_data_valuesZ randx_ge0_idxZ randx_lt0_idxZrand1_stat_valuesZrand1_stat_values_arrayrZrand2_stat_valuesZrand2_stat_values_arrayZrand1_average_arrayZrand2_average_arrayZ scores_diffZscores_diff_meanZscores_diff_varZscores_diff_stdr4r4r5 calculate_cisMKIGE    $$      rcCs|dkrd}|S|dkrd}|S|dkrd}|S|dkr d}|S|d kr(d }|S|d kr0d }|S|d kr8d}|S|dkr@d}|S|dkrHd}|S|dkrPd}|S|dkrXd}|S|dkr`d}|S|dkrhd}|S|dkrpd}|S|dkrxd}|S|dkrd }|S|d!krd"}|S|d#krd$}|S|d%krd&}|S|d'krd(}|S|d)krd*}|S|d+krd,}|S|d-krd.}|S|d/krd0}|S|d1krd2}|S|d3krd4}|S|d5krd6}|S|d7krd8}|S|d9krd:}|S|d;krd<}|S|d=krd>}|S|d?krd@}|S|dAkr dB}|S|dCkrdD}|S|dEkrdF}|S|dGkr%dH}|S|dIkr.dJ}|S|dKkr7dL}|S|dMkr@dN}|S|dOkrIdP}|S||dQtdR|S)SaT! Get the formalized name of the statistic being plotted Args: stat - string of the simple statistic name being plotted Returns: stat_plot_name - string of the formal statistic name being plotted biasBiasrmseRoot Mean Square Errormsess&Murphy's Mean Square Error Skill ScorersdRatio of Standard Deviationrmse_md&Root Mean Square Error from Mean Errorrmse_pv-Root Mean Square Error from Pattern VariationpcorPattern CorrelationaccAnomaly Correlation CoefficientfbarForecast Averages fbar_obar!Forecast and Observation Averages speed_err5Difference in Average FCST and OBS Wind Vector Speedsdir_err8Difference in Average FCST and OBS Wind Vector Directionrmsve(Root Mean Square Difference Vector Error vdiff_speedDifference Vector Speed vdiff_dirDifference Vector Directionfbar_obar_speedAverage Wind Vector Speed fbar_obar_dirAverage Wind Vector Direction fbar_speed"Average Forecast Wind Vector Speedfbar_dir&Average Forecast Wind Vector DirectionorateObservation Ratebaser Base Ratefrate Forecast Rate orate_frateObservation and Forecast Rates baser_frateBase and Forecast RatesaccuracyAccuracyfbiasFrequency BiaspodProbability of DetectionhrateHit RatepofdProbability of False DetectionfarateFalse Alarm Ratepodn)Probability of Detection of the Non-EventfaratioFalse Alarm RatiocsiCritical Success Indexts Threat ScoregssGilbert Skill ScoreetsEquitable Threat ScorehkHanssen-Kuipers DiscriminanttssTrue Skill ScorepssPeirce Skill ScorehssHeidke Skill Score is not a valid optionr)rr)rrrr4r4r5get_stat_plot_names VTRPNLJHFDB>:86420.,*(&$"          r cAs|jjdgkrG|dd}|dks|dks|dkr<|jdddg}|jddd}|jddd}n|jddd}ntfdd d Drd }|jddd }|jddd }|jddd} |jddd} |jddd} ntfdd dDrd}|jddd} |jddd} |jddd}|jddd}|jddd}ntfdd dDrd}|jddd}|jddd}|jddd}|jddd}|jddd }|jddd!}|jddd"}n2tfd#d d$Dr\d%}|jddd&}|jddd'}|jddd(}|jddd)}|jddd*}|jddd+}|jddd,}ntfd-d d.Drd/}|jddd }|jddd }|jddd0}|jddd1} |jddd2}!|jddd3}"|jddd4}#|jddd5}$|jddd6}%|jddd7}&|jddd8}'|jddd9}(|jddd:})|jddd;}*|jddd<}+|jddd=},nEtfd>d d?Dr7d@}|jddd}-|jdddA}.|jdddB}/|jdddC}0|jdddD}1n |dEtdF|dGkr}dH}2|d krR||}n+|dkrct |t |}n|d/krn||}n|d@kr{|.|/|.|0}n|dIkrdJ}2|d krt | | dK| }n|dkrt ||dK|}n|dLkrdM}2|d kr| | dK| }3| ||}4dF|3|4}n|dkr||dK|}3|||||}4dF|3|4}n|dNkr=dO}2|d kr | ||}5| ||}4t |5t |4}np|dkr2|||||}5|||||}4t |5t |4}nK|d/kr;|#|$}n@|dPkrjdQ}2|d krTt ||dK}n)|dkrht ||dK||dK}n|dRkrdS}2|d kr| |dK}5| |dK}4| ||t |5|4}6t |5|4dKt |5|4|6}n|dkr|||||}5|||||}4|||||t |5|4}6t |5|4dKt |5|4|6}n|dTkr4dU}2|d kr| ||}5| ||}4| ||t |5|4}nu|dkr2|||||}5|||||}4|||||t |5|4}nI|dVkrgdW}2|dkrW|| | t || | || | }n&|d%kre|t ||}n|dXkrdY}2|d krw|}n|dkrt |}n|d/kr|}n|dkrdZ}2|d kr|jddd d g}|jddd }|jddd }n|dkr|jddd!d"g}t |jddd!}t |jddd"}n|d/kr|jddd d g}|jddd }|jddd }nx|d[krd\}2|d/kr|+}nh|d]kr%d^}2|d/kr#|,}nX|d_kr5d`}2|d/kr3|"}nH|dakrEdb}2|d/krC|)}n8|dckrUdd}2|d/krS|*}n(|dekrldf}2|d/krj|jdddg}n|dhkrdi}2|d/kr|jdddj}n|dkkrdl}2|d/kr|'}n|dmkrdn}2|d/kr|%}n|doks|dpkr|dokrdq}2n|dpkrdr}2|d@kr|.|0|-}n|dskrdt}2|d@kr|.|/|-}n|dks|dkr|dkrdu}2n|dkrdv}2|d@kr|.|/|-}|.|0|-}t j ||gdFdw}nk|dxkr&dy}2|d@kr$|.|1|-}nW|dzkrtj|j|;|>}tj|j|;|>}=nY|:dkr[t|j d};t|j dF}>t|j dK}?tj|j|;|>|?}t|j dK}?tj|j|;|>|?}=tj|<|=g}@nq|:dFkrt|j d};tj|jdF|;}@nV|:dKkrt|j d};t|j dF}>tj|jdF|;|>}@n0|:dkrt|j d};t|j dF}>t|j dK}?tj|jdF|;|>|?}@||@|2fS)a! Calculate the statistic from the data from the read in MET .stat file(s) Args: model_data - Dataframe containing the model(s) information from the MET .stat files stat - string of the simple statistic name being plotted Returns: stat_values - Dataframe of the statistic values stat_values_array - array of the statistic values stat_plot_name - string of the formal statistic name being plotted rbzEmpty model_data dataframeZNULLrrrNc3|]}|vVqdSNr4relemZmodel_data_columnsr4r5 z!calculate_stat..)rcrdrhrarcrdrerfrgc3r!r"r4r#r%r4r5r&r')rjrkrhrirjrkrlrmrnc3r!r"r4r#r%r4r5r&r')rprqrorprqrrrsrtrurvc3r!r"r4r#r%r4r5r&r')rxryrwrxryrzr{r|r}r~c3r!r"r4r#r%r4r5r&r')rrrrrrrrrrrrrrrrrc3r!r"r4r#r%r4r5r&r')rrrrrrrz*Could not recognize line type from columnsrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr)rrrr)rrrrrrrrrrrrrr)axisrrrrrrrrrr rr r r r rrrrrrrrrrrrrrrrrrr)rrtolistZwarningrr#rrr!rrconcatrZnlevelsrZget_level_valuesuniquermasked_invalidreshaper")ArZ model_datarrZ stat_valuesZstat_values_fbarZstat_values_obarrZobarZfobarZffbarZoobarZfabarZoabarZfoabarZffabarZooabarZufbarZvfbarZuobarZvobarZuvfobarZuvffbarZuvoobarZufabarZvfabarZuoabarZvoabarZuvfoabarZuvffabarZuvooabarZfs_rmsZos_rmsZmsverZfstdevZostdevZfdirZodirrZ obar_speedrrrrtotalZfy_oyZfy_onZfn_oyZfn_onrZmseZvar_oZvar_fRCZCaZCbZnindexZindex0Zstat_values_array_fbarZstat_values_array_obarZindex1index2Zstat_values_arrayr4r%r5r}s&                                          ( "$                                                                                           rcCs||dtj|dd}d||vr#|d|d}|d7}nd|vr4|d|d}|d7}tj|d|}|S) Nr_dump_row.statr8 fcst_leadfcst_lead_avgs.txtz_fcst_lead_avgs.txtrospathbasenamerEjoin)rinput_filenamer4output_base_dirZlead_avg_filenameZ lead_avg_filer4r4r5get_lead_avg_files$    r>cCs|dtj|dd}d||vr|d|d}nd|vr0|d|d}|d7}|d|d7}tj|d |}|S) Nrr3r8r4r5Z_fcst_lead_avgsZ_CI_r6rr7)rr<r4r=rZ CI_filenameZCI_filer4r4r5 get_ci_files"   r?)r8rr&numpyr!pandasrwarningsfilterwarningsr6rGr`rrrrr rr>r?r4r4r4r5s. 9,!E'-nd