## Original Article

J Environ Health Sci. 2022; 48(5): 266-271

Published online October 31, 2022 https://doi.org/10.5668/JEHS.2022.48.5.266

## Differences by Selection Method for Exposure Factor Input Distribution for Use in Probabilistic Consumer Exposure Assessment

Sohyun Kang¹, Jinho Kim¹, Miyoung Lim², Kiyoung Lee¹,²*

¹Department of Environmental Health Sciences, Graduate School of Public Health, Seoul National University
²Institute of Health and Environment, Seoul National University

*Correspondence to: Department of Environmental Health Sciences, Graduate School of Public Health, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Tel: +82-2-880-2735
Fax: +82-2-762-2888
E-mail: cleanair@snu.ac.kr

Received: August 23, 2022; Revised: October 14, 2022; Accepted: October 17, 2022

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

### Highlights

ㆍ There are no guidelines for choosing a goodness-of-fit (GOF) method when selecting models for probabilistic exposure assessment.
ㆍ We compared the outcomes of exposure models for consumer products using five GOF methods.
ㆍ Exposure distributions based on the Anderson-Darling and Kolmogorov-Smirnov tests were similar to individual exposure estimates.
ㆍ The choice of GOF method could cause significant differences in exposure assessment outcomes.

### Abstract

Background: The selection of distributions of input parameters is an important component in probabilistic exposure assessment. Goodness-of-fit (GOF) methods are used to determine the distribution of exposure factors. However, there are no clear guidelines for choosing an appropriate GOF method.
Objectives: The outcomes of probabilistic consumer exposure assessment were compared using five different GOF methods for the selection of input distributions: the chi-squared test, Kolmogorov-Smirnov (K-S) test, Anderson-Darling (A-D) test, Akaike information criterion (AIC), and Bayesian information criterion (BIC).
Methods: Individual exposures were estimated based on product usage factor combinations from 10,000 respondents. The distribution of individual exposures was considered to be the true value of population exposure.
Results: Among the five GOF methods, probabilistic exposure distributions obtained using the A-D and K-S methods were similar to the individual exposure estimates. For the 10 consumer products (CPs), the 95th percentiles of the probabilistic distributions were 0.73 to 1.92 times those of the individual estimates for the A-D method, and 0.73 to 1.60 times (excluding tire-shine spray) for the K-S method.
Conclusions: Exposure assessment results differed significantly depending on the GOF method selected. Therefore, the GOF method for probabilistic consumer exposure assessment should be carefully selected.

Keywords: Probabilistic exposure assessment, consumer products, distribution of exposure factors, goodness-of-fit test

### I. Introduction

Probabilistic exposure assessment can be used for exposure and risk assessment associated with consumer products (CPs). It is important to select input variables carefully to obtain good outcomes. The first step of a probabilistic exposure assessment is to characterize input parameters as distribution functions. Among the various methods used to assess distribution, testing the fit of a theoretical distribution to an empirical data set is a typical statistical approach.1) Goodness-of-fit (GOF) tests can be used to obtain exposure factor distributions by selecting probabilistic models and estimating the related parameters.2) In the second step, the distribution is combined with a mathematical simulation (e.g., using the Monte Carlo method) to obtain exposure assessment results.

There are many statistical approaches for testing GOF, including the chi-squared test, Kolmogorov-Smirnov (K-S) test, Anderson-Darling (A-D) test, Akaike information criterion (AIC), and Bayesian information criterion (BIC). The chi-squared test is a general test based on the squared differences between the observed and expected frequencies, and it can be used to test any distribution.1) The K-S test is a nonparametric test based on the maximum absolute difference between the empirical cumulative distribution function (CDF) and the theoretical CDF.1) The A-D test analyzes the weighted squared difference between the empirical and fitted CDFs.2) The AIC and BIC are likelihood-based criteria that rank candidate distributions by information loss. When selecting a GOF method for probabilistic exposure assessment, researchers should consider the statistical characteristics of each method. The chi-squared test is very sensitive to the number and interval width of bins3) and requires the bins to be set properly. The AIC- and BIC-based methods tend to select distributions with minimal information loss. The BIC is preferred over the AIC when the sample size is much larger than the number of parameters.4)
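The five criteria can be sketched in code to make their differences concrete. The following is a minimal illustration using SciPy, not the procedure used in this study: the function name `gof_scores`, the equiprobable binning for the chi-squared statistic, the fixed zero location parameter, and the synthetic sample are all assumptions made for the example.

```python
import numpy as np
from scipy import stats

def gof_scores(data, dist, n_bins=10):
    """Fit `dist` by maximum likelihood and return five GOF scores (lower is better)."""
    data = np.sort(np.asarray(data, dtype=float))
    n = data.size
    params = dist.fit(data, floc=0)            # assumption: location fixed at 0
    k = len(params) - 1                        # free parameters (loc is not estimated)
    cdf = np.clip(dist.cdf(data, *params), 1e-12, 1 - 1e-12)  # guard the logs below

    # K-S: largest absolute gap between the empirical and fitted CDFs
    ks = stats.kstest(data, dist.cdf, args=params).statistic

    # A-D: squared CDF gap, weighted so that both tails count more
    i = np.arange(1, n + 1)
    ad = -n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1])))

    # Chi-squared: observed vs expected counts in equiprobable bins (assumed binning)
    interior_edges = dist.ppf(np.linspace(0, 1, n_bins + 1)[1:-1], *params)
    observed = np.bincount(np.searchsorted(interior_edges, data, side="right"),
                           minlength=n_bins)
    expected = n / n_bins
    chi2 = np.sum((observed - expected) ** 2 / expected)

    # AIC and BIC: penalised maximised log-likelihood
    loglik = np.sum(dist.logpdf(data, *params))
    aic = 2 * k - 2 * loglik
    bic = k * np.log(n) - 2 * loglik
    return {"chi2": chi2, "ks": ks, "ad": ad, "aic": aic, "bic": bic}

# Synthetic right-skewed sample standing in for a surveyed exposure factor
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=500)

candidates = {"lognorm": stats.lognorm, "gamma": stats.gamma,
              "weibull_min": stats.weibull_min}
scores = {name: gof_scores(sample, dist) for name, dist in candidates.items()}
criteria = ["chi2", "ks", "ad", "aic", "bic"]
best = {c: min(scores, key=lambda name: scores[name][c]) for c in criteria}
print(best)  # the winning candidate can differ from one criterion to another
```

Because each criterion weighs the body and the tails of the data differently, the entries of `best` need not agree, which is the situation this study examines.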

There are no clear guidelines for selecting the most appropriate GOF test.3) The US EPA recommended choosing the distribution function according to the number of data points, the outcome of interest, and the tail of the distribution.5) Gilsenan, Lambe, and Gibney6) used the A-D test for food chemical exposure assessment because it tends to focus on distribution tails. In statistical, ecological, and epidemiological research, GOF methods have been selected based on what previous related studies used.7,8) Little research has been conducted on whether GOF test methods accurately recover the original input parameters. Nonetheless, Chiew et al.4) created mathematically ideal random distributions and compared the distributions derived from several GOF test methods. They found that the performance of the statistical methods varied depending on the data characteristics. Other studies have used several GOF tests simultaneously to account for the limitations of each method.9-11)

The most appropriate GOF test may depend on the statistical characteristics of the input parameters. In probabilistic exposure assessment of CPs, the differences arising from the use of different GOF methods have not been evaluated. In consumer exposure assessments, input data collected from surveys tend to be right-skewed, with a long right tail.12,13) The mean values of exposure factors, such as use frequency and use amount, are larger than the median values. A small number of survey respondents have very high usage values. These input characteristics might affect how the results of probabilistic exposure assessment vary with the GOF method.
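As an illustration of why the mean exceeds the median in right-skewed data, suppose (purely as an assumption for this example, not a finding of the study) that an exposure factor $X$ is lognormal with log-scale parameters $\mu$ and $\sigma > 0$. Then

$$\operatorname{median}(X) = e^{\mu}, \qquad \operatorname{E}[X] = e^{\mu + \sigma^{2}/2} > e^{\mu},$$

so the gap between the mean and the median widens as the right tail becomes heavier (larger $\sigma$).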

The aim of this study was to evaluate the outcomes of probabilistic exposure assessments of CPs using different GOF methods. The probabilistic exposure assessment was conducted using exposure factor data for 10 CPs from a national-scale exposure factor database of 10,000 participants. For each GOF method, the probabilistic exposure distribution was compared with individual exposure estimates.

### II. Materials and Methods

### 1. Data on CPs

Exposure factor data for 18 CPs, which included cleaning products, automotive care products, and surface protection products, were collected from 10,000 participants. The data were obtained through face-to-face interviews in 17 metropolitan areas and provinces in Korea. The surveyed population consisted of individuals over 15 years old and was selected with the gender ratio taken into account. The survey collected usage information such as application type (spray, aerosol, liquid), frequency of use (over 7 days, 1 month, 6 months, or 1 year), number of applications, average duration per application, and amount used per application.13) This study targeted the 10 CPs, out of the 18, for which 300 or more participants reported use. The 10 CPs were mold-stain spray, anti-fogging spray, tire-shine spray, car-coating spray, glass-cleaning spray, rain-repelling spray, liquid household bleach, liquid washing-machine cleaner, liquid drain cleaner, and rust-inhibiting spray. The number of users of each CP ranged from 305 to 2,315. Liquid household bleach had the most users (n=2,315) and anti-fogging spray had the fewest (n=305).

### 2. Exposure estimation

This study assessed daily inhalation exposure to the 10 CPs based on product usage. The National Institute of Environmental Research in Korea (KNIER) has developed exposure algorithms specific to various product-exposure scenarios. Thus, daily inhalation exposure was estimated using an appropriate KNIER equation.14) Exposure to all 10 CPs assessed in this study occurred through volatilization. It was assumed that mold-stain spray and liquid household bleach were used in the bathroom and that liquid washing-machine cleaner and liquid drain cleaner were used indoors. All other CPs were assumed to be used outdoors.

Thus, volatilization exposure was assessed for each CP using the following equation:

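$$D_{inh} = \frac{A_p \times W_f \times F \times \left(1 - e^{-N \times t}\right) \times IR \times abs \times n}{V \times N \times BW}$$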

where $D_{inh}$ is daily exposure via inhalation (mg/kg/day), $A_p$ is the use amount of CP (mg/event), $W_f$ is the fraction of a specific chemical in the product (unitless), $F$ is the emission rate (unitless), $N$ is the ventilation rate (air changes per hour, h⁻¹), $IR$ is the inhalation rate (m³/h), $abs$ is the absorption rate (unitless), $t$ is the use duration of CP (h/event), $n$ is the use frequency of CP (event/day), $V$ is the volume of space used (m³), and $BW$ is body weight (kg).

Among the equation input parameters, $A_p$, $t$, and $n$ were taken from the surveyed exposure factors for each CP. The values of $W_f$, $abs$, and $F$ were assumed to be 1. $N$ was assumed to be 2 h⁻¹ for the bathroom and 0.6 h⁻¹ for outdoor and other indoor use. $BW$ was assumed to be 64.2 kg, the mean body weight of Korean adults. $V$ was assumed to be 9.3 m³ for the bathroom and 33.3 m³ for outdoor and other indoor spaces.14) $IR$ was assumed to be 0.6 m³/h, the mean inhalation rate for Korean adults reported in the Korean Exposure Factors Handbook.15)
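The volatilization equation and its default values can be written as a short function. This is a minimal sketch: the function name, argument names, and the example user are hypothetical, and `absorption` stands for the symbol abs (renamed because `abs` is a Python built-in).

```python
import math

def daily_inhalation_exposure(ap_mg, t_h, n_per_day, *, N=0.6, V=33.3,
                              IR=0.6, BW=64.2, Wf=1.0, F=1.0, absorption=1.0):
    """Daily inhalation exposure D_inh (mg/kg/day) from the volatilization equation.

    ap_mg: use amount (mg/event); t_h: use duration (h/event);
    n_per_day: use frequency (events/day). Defaults are the values the study
    assumed for outdoor and other indoor use.
    """
    volatilized = ap_mg * Wf * F * (1.0 - math.exp(-N * t_h))
    return volatilized * IR * absorption * n_per_day / (V * N * BW)

# Hypothetical bathroom user: 10 g per event, 15 minutes per event, once a week
d_inh = daily_inhalation_exposure(10_000, 0.25, 1 / 7, N=2.0, V=9.3)
print(f"{d_inh:.4f} mg/kg/day")
```

With all other inputs fixed, the result scales linearly with the use amount and the use frequency, while the use duration enters through the saturating term 1 − exp(−N·t).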

### 3. Comparison of individual exposure estimates and probabilistic exposure distributions

Individual exposure estimates were compared with the distribution derived from probabilistic exposure assessment. The comparison process is illustrated in Fig. 1. Individual exposure values were calculated to estimate real-life CP exposures. To obtain them, we used each respondent's answers to the survey questions about personal product use and entered that respondent's exposure parameters into the exposure algorithm. The values were then combined to obtain a distribution of exposure in the parent population. Statistics (50th and 95th percentiles) of the calculated individual exposure estimates were compared with those of the probabilistic exposure distribution.

Figure 1. Schematic of the process for comparison of individual exposure estimates and probabilistic exposure distributions

Thereafter, probabilistic exposure assessments for the CPs were performed. The distributions of three input parameters (use frequency, use amount, and use duration) were obtained using five GOF methods: the chi-squared test, K-S test, A-D test, AIC, and BIC. The distributions of the input parameters were put into the exposure algorithm using Monte Carlo simulation (10,000 iterations) to obtain a probabilistic exposure distribution. All computational simulations for probabilistic exposure assessment were performed using @RISK 7.5 (Palisade Corporation, Ithaca, NY, USA), and statistics (50th and 95th percentiles) of the probabilistic exposure distributions were calculated.
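The Monte Carlo step can be sketched as follows. This is an illustration only and does not reproduce @RISK: the fitted input distributions below are hypothetical, and the three inputs are drawn independently with plain random sampling, which is an assumption of this sketch.

```python
import numpy as np
from scipy import stats

def simulate_exposure(ap_dist, t_dist, n_dist, *, N, V, IR=0.6, BW=64.2,
                      iterations=10_000, seed=0):
    """Draw each input independently and pass the draws through the exposure equation."""
    rng = np.random.default_rng(seed)
    ap = ap_dist.rvs(size=iterations, random_state=rng)  # use amount, mg/event
    t = t_dist.rvs(size=iterations, random_state=rng)    # use duration, h/event
    n = n_dist.rvs(size=iterations, random_state=rng)    # use frequency, events/day
    # Wf, F and abs were all set to 1 in the study, so they drop out of the product
    return ap * (1.0 - np.exp(-N * t)) * IR * n / (V * N * BW)

# Hypothetical fitted inputs for a product used in the bathroom
doses = simulate_exposure(
    stats.lognorm(s=1.0, scale=5_000.0),  # use amount
    stats.gamma(a=2.0, scale=0.1),        # use duration
    stats.lognorm(s=0.8, scale=0.1),      # use frequency
    N=2.0, V=9.3,
)
p50, p95 = np.percentile(doses, [50, 95])
print(f"50th: {p50:.4f}, 95th: {p95:.4f} mg/kg/day")
```

Swapping in the distribution selected by a different GOF method changes only the three frozen distributions passed to `simulate_exposure`; the rest of the calculation is unchanged.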

To compare performance among the GOF methods, the ratio of the probabilistic exposure estimate to the individual exposure estimate was calculated at the 50th and 95th percentiles, so that a ratio above 1 means the probabilistic estimate exceeded the individual estimate. The statistics of each probabilistic exposure estimate obtained using each GOF method were compared with those of the individual exposure estimates. With this comparison, the GOF methods used for probabilistic exposure assessment were evaluated.
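The ratio calculation can be sketched as below, following the direction given in the Table 2 caption (probabilistic estimate divided by individual estimate). The function name and the synthetic samples are assumptions for illustration.

```python
import numpy as np

def percentile_ratios(probabilistic, individual, percentiles=(50, 95)):
    """Ratio of probabilistic to individual exposure at each requested percentile."""
    prob = np.percentile(probabilistic, percentiles)
    indiv = np.percentile(individual, percentiles)
    return dict(zip(percentiles, prob / indiv))

# Synthetic stand-ins: the simulated distribution has a heavier right tail
rng = np.random.default_rng(1)
individual = rng.lognormal(mean=-3.0, sigma=1.0, size=900)
probabilistic = rng.lognormal(mean=-3.0, sigma=1.3, size=10_000)

ratios = percentile_ratios(probabilistic, individual)
print(ratios)  # values above 1 mean the probabilistic estimate is the larger one
```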

### III. Results and Discussion

Individual exposure was calculated by applying the exposure factors derived from the survey to relevant exposure scenarios and using an appropriate algorithm. The exposure distribution of each CP in the parent population was estimated by combining individual exposure (Table 1). For respondents at the 95th-percentile exposure level, liquid household bleach was associated with the highest exposure estimate (9.4425 mg/kg/day), followed by liquid drain cleaner (5.5644 mg/kg/day), and mold-stain spray (0.7314 mg/kg/day). A similar trend was observed for the 50th-percentile. Conversely, anti-fogging spray was associated with the lowest exposure estimate (0.0112 mg/kg/day) for respondents at the 95th-percentile exposure level, followed by rain-repelling spray (0.0189 mg/kg/day) and rust-inhibiting spray (0.0198 mg/kg/day). For respondents at the 50th-percentile exposure level, estimates associated with these three CPs were still small.

Table 1. Statistics of individual exposure estimates for consumer products (individual inhalation exposure, mg/kg/day)

| Products | N | Mean | SD* | 50th | 75th | 95th | 99th |
|---|---|---|---|---|---|---|---|
| Mold-stain spray | 924 | 0.1794 | 0.4104 | 0.0497 | 0.1149 | 0.7314 | 2.7926 |
| Anti-fogging spray | 305 | 0.0042 | 0.0219 | 0.0015 | 0.0039 | 0.0112 | 0.0194 |
| Tire-shine spray | 315 | 0.0105 | 0.0207 | 0.0049 | 0.0107 | 0.0415 | 0.0869 |
| Car-coating spray | 643 | 0.0272 | 0.0596 | 0.0077 | 0.0252 | 0.1079 | 0.3007 |
| Glass-cleaning spray | 350 | 0.0140 | 0.0242 | 0.0066 | 0.0158 | 0.0511 | 0.1202 |
| Rain-repelling spray | 384 | 0.0046 | 0.0095 | 0.0017 | 0.0043 | 0.0189 | 0.0603 |
| Liquid household bleach | 2,315 | 2.5316 | 4.2399 | 1.1002 | 2.8506 | 9.4425 | 18.8850 |
| Liquid washing-machine cleaner | 442 | 0.0120 | 0.0304 | 0.0022 | 0.0085 | 0.0532 | 0.1597 |
| Liquid drain cleaner | 1,843 | 1.4951 | 4.1552 | 0.5104 | 1.3753 | 5.5644 | 14.1038 |
| Rust-inhibiting spray | 371 | 0.0040 | 0.0140 | 0.0003 | 0.0024 | 0.0198 | 0.0584 |

*SD, standard deviation.

These individual exposure estimates were considered to be the true values in comparisons with the results of probabilistic exposure assessment. This was because the questionnaire data were collected from a sample population that reflected the gender and regional distribution of the national population. Additionally, each CP had a sample size of 300 or more; thus, the sample for each CP was considered large enough to be nationally representative. The individual exposure results were therefore considered to reflect actual individual exposure in the Korean population. Lim et al.16) used a similar method to compare distributions of product-based and receptor-based aggregate exposure doses in their CP exposure assessment.

The ratios of probabilistic to individual exposure at the 50th- and 95th-percentile exposure levels were calculated (Table 2). Based on the K-S test, two products had 95th-percentile ratios of 0.9 to 1.1, and four products had ratios of 0.8 to 1.2. Based on the A-D test, three products had 95th-percentile ratios of 0.9 to 1.1, and four products had ratios of 0.8 to 1.2. The exposure ratios varied greatly among GOF methods for CPs such as liquid washing-machine cleaner and liquid drain cleaner. By contrast, the variation in exposure ratios was small for anti-fogging spray and car-coating spray.

Table 2. Ratios of probabilistic exposure estimates to individual exposure estimates using 50th- and 95th-percentile values

| Products | 50th: Chi-squared | 50th: K-S | 50th: A-D | 50th: AIC | 50th: BIC | 95th: Chi-squared | 95th: K-S | 95th: A-D | 95th: AIC | 95th: BIC |
|---|---|---|---|---|---|---|---|---|---|---|
| Mold-stain spray | 2.49 | 0.98 | 0.88 | 0.66 | 0.67 | 3.25 | 0.85 | 0.82 | 2.53 | 2.84 |
| Anti-fogging spray | 0.71 | 0.79 | 0.85 | 0.61 | 0.60 | 1.66 | 1.60 | 1.66 | 2.05 | 2.09 |
| Tire-shine spray | 1.60 | 1.22 | 0.81 | 0.42 | 0.42 | 20.33 | 6.44 | 1.07 | 1.57 | 1.66 |
| Car-coating spray | 0.85 | 1.08 | 1.10 | 0.85 | 0.88 | 1.19 | 1.02 | 0.95 | 1.23 | 1.26 |
| Glass-cleaning spray | 1.73 | 0.84 | 0.86 | 0.78 | 0.78 | 32.10 | 1.18 | 1.21 | 7.16 | 6.87 |
| Rain-repelling spray | 1.29 | 1.02 | 1.03 | 0.50 | 0.53 | 16.95 | 1.27 | 1.29 | 3.71 | 3.50 |
| Liquid household bleach | 2.24 | 0.92 | 0.84 | 0.71 | 0.72 | 3.08 | 1.09 | 1.05 | 1.30 | 1.24 |
| Liquid washing-machine cleaner | 1.10 | 1.00 | 0.97 | 0.71 | 0.70 | 1.37 | 1.60 | 1.92 | 21.03 | 21.22 |
| Liquid drain cleaner | 1.37 | 0.91 | 0.91 | 0.69 | 0.70 | 25.47 | 1.41 | 1.67 | 49.59 | 47.63 |
| Rust-inhibiting spray | 1.05 | 0.77 | 0.77 | 0.61 | 0.61 | 0.88 | 0.73 | 0.73 | 5.30 | 4.58 |

K-S: Kolmogorov-Smirnov test, A-D: Anderson-Darling test, AIC: Akaike information criterion, BIC: Bayesian information criterion.
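The band counts reported for the K-S and A-D tests can be checked directly against the 95th-percentile columns of Table 2. The short script below uses the tabulated ratios; the helper name `count_in_band` is hypothetical.

```python
# 95th-percentile ratios (probabilistic / individual) taken from Table 2
ks_95 = {
    "Mold-stain spray": 0.85, "Anti-fogging spray": 1.60, "Tire-shine spray": 6.44,
    "Car-coating spray": 1.02, "Glass-cleaning spray": 1.18,
    "Rain-repelling spray": 1.27, "Liquid household bleach": 1.09,
    "Liquid washing-machine cleaner": 1.60, "Liquid drain cleaner": 1.41,
    "Rust-inhibiting spray": 0.73,
}
ad_95 = {
    "Mold-stain spray": 0.82, "Anti-fogging spray": 1.66, "Tire-shine spray": 1.07,
    "Car-coating spray": 0.95, "Glass-cleaning spray": 1.21,
    "Rain-repelling spray": 1.29, "Liquid household bleach": 1.05,
    "Liquid washing-machine cleaner": 1.92, "Liquid drain cleaner": 1.67,
    "Rust-inhibiting spray": 0.73,
}

def count_in_band(ratios, low, high):
    """Number of products whose ratio lies within [low, high]."""
    return sum(low <= r <= high for r in ratios.values())

for label, ratios in (("K-S", ks_95), ("A-D", ad_95)):
    print(label, count_in_band(ratios, 0.9, 1.1), count_in_band(ratios, 0.8, 1.2))
# K-S 2 4
# A-D 3 4
```

The ranges quoted in the abstract follow from the same columns: the A-D ratios span 0.73 to 1.92, and the K-S ratios span 0.73 to 1.60 once tire-shine spray (6.44) is excluded.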

The chi-squared test often yielded larger 95th-percentile exposure ratios than the other GOF methods. The 95th-percentile ratios derived from the K-S and A-D tests were smaller than those derived from the other methods for most CPs. The 95th-percentile ratios estimated using the AIC and BIC were similar to each other. Trends in the 50th-percentile ratios were comparable with those in the 95th-percentile ratios. However, in contrast to the results for the 95th-percentile values, the AIC and BIC methods tended to produce smaller 50th-percentile values than the other GOF methods. In general, the 95th-percentile ratios estimated using the GOF methods were greater than 1; that is, the 95th-percentile values estimated using the GOF methods were mostly greater than the actual 95th-percentile value. Fewer of the 50th-percentile ratios than of the 95th-percentile ratios were greater than 1.

Exposure-estimate ratios calculated using the K-S and A-D tests were closer to 1 than those calculated using the other GOF methods. The K-S test is most sensitive around the median of a distribution,2) whereas the A-D test emphasizes the fit to the tails of a distribution.1) Therefore, the K-S and A-D methods appear to be more suitable for estimating probabilistic exposure than the other GOF methods. However, some methods underestimated the 95th-percentile exposure for certain CPs (e.g., mold-stain spray, car-coating spray, rust-inhibiting spray), producing ratios of less than 1. Underestimating the 95th-percentile values can lead to errors in risk assessments based on reasonable maximum exposures. According to the US EPA, when a GOF method is used to obtain a distribution, the distribution function may vary depending on the number of parent data points and the tail of their distribution.5) The number of raw data points for the exposure factors used in this study ranged from 305 to 2,315, and the skewness of the distributions also differed.13) The difference in the appropriateness of the results may have originated from these differences in the parent distributions.

The results indicated that the use of different GOF methods can lead to significantly different outcomes; i.e., an appropriate GOF method should be carefully selected for probabilistic exposure assessment. However, no single GOF method was the most appropriate in all cases, and the selection should be case-dependent. This was the first study to compare the results of probabilistic exposure assessment according to the selection of the GOF method, and its main contribution is to show that the results differ significantly depending on that selection. Further research is needed to understand why these GOF methods produce different results.

The values calculated using the exposure algorithm did not represent actual internal exposure, because the fraction of a specific chemical in the products, the absorption rate, and the emission rate were assumed to be 1. This was done to exclude variation due to chemical type. Therefore, this study focused on the outcomes of using different GOF methods to estimate distributions of exposure factors related to product-usage patterns. How different GOF methods affect the accuracy of internal exposure estimation via probabilistic exposure assessment should be investigated in a separate study.

### IV. Conclusions

This study conducted probabilistic exposure assessments using five GOF methods for 10 CPs, each with data from 300 or more users. The results of the probabilistic exposure assessments were compared, and they differed significantly according to the GOF method selected. Depending on the GOF method used, the 95th percentile of the probabilistic exposure distribution was up to 49.59 times the true value (Table 2). This difference may come from the number of data points and the shape of the parent distribution. This study does not suggest that any one GOF method is the best for probabilistic exposure assessment of CPs. However, exposure scientists should carefully choose the GOF method when conducting probabilistic exposure assessments.

### Acknowledgments

This study was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2019R1A2C1083938).

Miyoung Lim (Research Professor), Kiyoung Lee (Professor)

### Conflict of Interest

### References

1. U.S. Environmental Protection Agency. Risk Assessment Guidance for Superfund: Volume III - Part A, Process for Conducting Probabilistic Risk Assessment. Washington, DC: EPA; 2001. https://www.epa.gov/sites/default/files/2015-09/documents/rags3adt_complete.pdf
2. Cullen AC, Frey HC. Probabilistic Techniques in Exposure Assessment: A Handbook for Dealing with Variability and Uncertainty in Models and Inputs. New York: Plenum Press; 1999. https://www.worldcat.org/ko/title/39951660
3. U.S. Environmental Protection Agency. Guiding Principles for Monte Carlo Analysis. Washington, DC: EPA; 1997. https://www.epa.gov/sites/default/files/2014-11/documents/montecar.pdf
4. Chiew E, Cauthen K, Brown N, Nozick L. Comparison of distribution selection methods. Commun Stat Simul Comput. 2022; 51(4): 1982-2005. https://doi.org/10.1080/03610918.2019.1691227
5. U.S. Environmental Protection Agency. Report of the Workshop on Selecting Input Distributions for Probabilistic Assessments. Washington, DC: EPA; 1999. https://nepis.epa.gov/Exe/ZyNET.exe/30004ZPJ.TXT?ZyActionD=ZyDocument&Client=EPA&Index=1995+Thru+1999&Docs=&Query=&Time=&EndTime=&SearchMethod=1&TocRestrict=n&Toc=&TocEntry=&QField=&QFieldYear=&QFieldMonth=&QFieldDay=&IntQFieldOp=0&ExtQFieldOp=0&XmlQuery=&File=D%3A%5Czyfiles%5CIndex Data%5C95thru99%5CTxt%5C00000013%5C30004ZPJ.txt&User=ANONYMOUS&Password=anonymous&SortMethod=h%7C-&MaximumDocuments=1&FuzzyDegree=0&ImageQuality=r75g8/r75g8/x150y150g16/i425&Display=hpfr&DefSeekPage=x&SearchBack=ZyActionL&Back=ZyActionS&BackDesc=Results page&MaximumPages=1&ZyEntry=1&SeekPage=x&ZyPURL
6. Gilsenan MB, Lambe J, Gibney MJ. Assessment of food intake input distributions for use in probabilistic exposure assessments of food additives. Food Addit Contam. 2003; 20(11): 1023-1033.
7. Cai T, Xia Y, Zhou Y. Generalized inflated discrete models: a strategy to work with multimodal discrete distributions. Sociol Methods Res. 2021; 50(1): 365-400. https://doi.org/10.1177/0049124118782535
8. Muoka AK, Ngesa OO, Waititu AG. Statistical models for count data. Sci J Appl Math Stat. 2016; 4(6): 256-262. https://www.sciencepublishinggroup.com/journal/paperinfo?journalid=149&doi=10.11648/j.sjams.20160406.12
9. Böhning D. Zero-inflated poisson models and C.A.MAN: a tutorial collection of evidence. Biom J. 1998; 40(7): 833-843. https://doi.org/10.1002/(SICI)1521-4036(199811)40:7%3c833::AID-BIMJ833%3e3.0.CO;2-O
10. Karlis D, Ntzoufras I. Analysis of sports data by using bivariate Poisson models. J R Stat Soc Ser D Stat. 2003; 52(3): 381-393. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9884.00366
11. Olanrewaju RO. Integer-valued time series model via generalized linear models technique of estimation. Int Ann Sci. 2018; 4(1): 35-43. https://journals.aijr.org/index.php/ias/article/view/455
12. Dimitroulopoulou C, Lucica E, Johnson A, Ashmore MR, Sakellaris I, Stranger M, et al. EPHECT I: European household survey on domestic use of consumer products and development of worst-case scenarios for daily use. Sci Total Environ. 2015; 536: 880-889.
13. Park JY, Lim M, Yang W, Lee K. Exposure factors for cleaning, automotive care, and surface protection products for exposure assessments. Food Chem Toxicol. 2017; 99: 128-134.
14. National Institute of Environmental Research. Regulation of Purpose and Method of Concerned Products Risk Assessment. Incheon: National Institute of Environmental Research; 2016. https://www.law.go.kr/admRulLsInfoP.do?admRulSeq=2100000071414
15. National Institute of Environmental Research. Korean Exposure Factors Handbook. Incheon: National Institute of Environmental Research; 2019. https://www.nl.go.kr/NL/contents/search.do?srchTarget=total&pageNum=1&pageSize=10&kwd=%ED%95%9C%EA%B5%AD%EC%9D%B8%EC%9D%98+%EB%85%B8%EC%B6%9C%EA%B3%84%EC%88%98+%ED%95%B8%EB%93%9C%EB%B6%81#viewKey=696876600&viewType=AH1&category=%EB%8F%84%EC%84%9C&pageIdx=1&jourId=
16. Lim M, Park JY, Lim JE, Moon HB, Lee K. Receptor-based aggregate exposure assessment of phthalates based on individual's simultaneous use of multiple cosmetic products. Food Chem Toxicol. 2019; 127: 163-172.

### Article

#### Original Article

J Environ Health Sci. 2022; 48(5): 266-271

Published online October 31, 2022 https://doi.org/10.5668/JEHS.2022.48.5.266

## Differences by Selection Method for Exposure Factor Input Distribution for Use in Probabilistic Consumer Exposure Assessment

Sohyun Kang1 , Jinho Kim1, Miyoung Lim2 , Kiyoung Lee1,2*

1Department of Environmental Health Sciences, Graduate School of Public Health, Seoul National University,
2Institute of Health and Environment, Seoul National University

Correspondence to:Department of Environmental Health Sciences, Graduate School of Public Health, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Tel: +82-2-880-2735
Fax: +82-2-762-2888
E-mail: cleanair@snu.ac.kr

Received: August 23, 2022; Revised: October 14, 2022; Accepted: October 17, 2022

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

### Abstract

Background: The selection of distributions of input parameters is an important component in probabilistic exposure assessment. Goodness-of-fit (GOF) methods are used to determine the distribution of exposure factors. However, there are no clear guidelines for choosing an appropriate GOF method.
Objectives: The outcomes of probabilistic consumer exposure assessment were compared by using five different GOF methods for the selection of input distributions: chi-squared test, Kolmogorov-Smirnov test (KS), Anderson-Darling test (A-D), Akaike information criterion (AIC) and Bayesian information criterion (BIC).
Methods: Individual exposures were estimated based on product usage factor combinations from 10,000 respondents. The distribution of individual exposure was considered as the true value of population exposures.
Results: Among the five GOF methods, probabilistic exposure distributions using the A-D and K-S methods were similar to individual exposure estimations. Comparing the 95th percentiles of the probabilistic distributions and the individual estimations for 10 CPs, there were 0.73 to 1.92 times differences for the A-D method, and 0.73 to 1.60 times differences (excluding tire-shine spray) for the K-S method.
Conclusions: There were significant differences in exposure assessment results among the selection of the GOF methods. Therefore, the GOF methods for probabilistic consumer exposure assessment should be carefully selected.

Keywords: Probabilistic exposure assessment, consumer products, distribution of exposure factors, goodness-of-fit test

### I. Introduction

Probabilistic exposure assessment can be used for exposure and risk assessment associated with consumer products (CPs). It is important to select input variables carefully to obtain good outcomes. The first step of a probabilistic exposure assessment is to characterize input parameters as distribution functions. Among the various methods used to assess distribution, testing the fit of a theoretical distribution to an empirical data set is a typical statistical approach.1) Goodness-of-fit (GOF) tests can be used to obtain exposure factor distributions by selecting probabilistic models and estimating the related parameters.2) In the second step, the distribution is combined with a mathematical simulation (e.g., using the Monte Carlo method) to obtain exposure assessment results.

There are many statistical approaches for testing GOF, including the chi-squared test, Kolmogorov-Smirnov (K-S) test, Anderson-Darling (A-D) test, Akaike information criterion (AIC), and Bayesian information criterion (BIC). The chi-squared test is a general test based on differences between the squares of the observed and expected frequencies that can be used to test any distribution.1) The K-S test is a nonparametric test that compares the maximum absolute difference between the empirical cumulative distribution function (CDF) and the theoretical CDF.1) The A-D test analyses the weighted square of the difference between the empirical and fitted CDFs.2) The AIC and BIC are statistical methods using the likelihood functions based on information loss. When selecting the GOF method for the probabilistic exposure assessment, researchers should consider the statistical characteristics of each method. The chi-squared test is very sensitive to the number and interval width of bins3) and requires the bins to be set properly. The AIC- and BIC-based methods tend to select distributions with minimal information loss. The BIC is preferred over the AIC when the sample size is much larger than number of parameters.4)

There are no clear guidelines for selecting the most appropriate statistical approaches of GOF test.3) US EPA recommended decisions of the distribution function depending on the number of data point, the outcome of interest, and the tail of distribution.5) Gilsenan, Lambe, and Gibney6) used A-D tests for food chemical exposure assessment because it’s tended to focus on distribution tails. In statistical, ecological, and epidemiological research, GOF methods were selected based on what previous related studies have used.7,8) Little research has been conducted on whether GOF test methods accurately recover the original input parameters. Nonetheless, Chiew et al.4) created mathematically ideal random distributions and compared the distributions derived from several GOF test methods. They found that the performance of statistical methods varied depending on the data characteristics. Other studies have used several GOF tests simultaneously to account for the limitations of each method.9-11)

Selecting the most appropriate statistical approaches of GOF test could be determined by statistical characteristics of input parameters. In probabilistic exposure assessment of CPs, the difference due to the use of different GOF methods was not evaluated. In consumer exposure assessments, the input data collected from surveys tended to be right-skewed distribution with a long right tail.12,13) The average values of exposure factors, such as use frequency and use amount, were larger than the median values. There were few respondents with very high usage pattern values in the survey. These input characteristics might have an effect on the probabilistic exposure assessment results according to the GOF methods.

The aim of this study was to evaluate the outcomes of probabilistic exposure assessments of CPs using different GOF methods. The probabilistic exposure assessment was conducted using exposure factor data for 10 CPs from national scale exposure factor database of 10,000 participants. For each GOF method, the probabilistic exposure distribution was compared with individual exposure estimates.

### 1. Data on CPs

Exposure factor data for 18 CPs, which included cleaning products, automotive care products, and surface protection products, were collected from 10,000 participants. The exposure factor data were obtained through face-to-face interview from 17 metropolitan areas and provinces in Korea. The surveyed population consisted of individuals over 15 years old considering gender ratio. Surveyed exposure data were usage information such as application type (spray, aerosol, liquid), frequency of use (7 days, 1 month, 6 months, 1 year), application number of CPs, the average time of CPs per application and the amount use of CPs per application.13) This study targeted 10 CPs which have 300 or more participants responded that they used among 18 CPs. The 10 CPs were mold-stain spray, anti-fogging spray, tire-shine spray, car-coating spray, glass-cleaning spray, rain-repelling spray, liquid household bleach, liquid washing-machine cleaner, liquid drain cleaner, and rust-inhibiting spray. The number of users of each CP ranged from 305 to 2,315. Liquid household bleach had the most users (n= 2,315) and anti-fogging spray had the fewest (n=305).

### 2. Exposure estimation

This study assessed daily inhalation exposure to the 10 CPs based on product usage. The National Institute of Environmental Research in Korea (KNIER) has developed exposure algorithms specific to various product-exposure scenarios. Thus, daily inhalation exposure was estimated using an appropriate KNIER equation.14) Exposure to all 10 CPs assessed in this study occurred through volatilization. It was assumed that mold-stain spray and liquid household bleach were used in the bathroom and that liquid washing-machine cleaner and liquid drain cleaner were used indoors. All other CPs were assumed to be used outdoors.

Thus, volatilization exposure was assessed for each CP using the following equation:

$D_{inh}=\frac{A_p \times W_f \times F \times \left(1-\exp(-N \times t)\right) \times IR \times abs \times n}{V \times N \times BW}$

where $D_{inh}$ is daily exposure via inhalation (mg/kg/day), $A_p$ is the use amount of CP (mg/event), $W_f$ is the fraction of a specific chemical in the product (unitless), $F$ is the emission rate (unitless), $N$ is the ventilation rate (events/h), $IR$ is the inhalation rate (m³/h), $abs$ is the absorption rate (unitless), $t$ is the use duration of CP (h/event), $n$ is the use frequency of CP (event/day), $V$ is the volume of space used (m³), and $BW$ is body weight (kg).

Among the input parameters of the equation, $A_p$, $t$, and $n$ were taken from the surveyed exposure factors for each CP. The values of $W_f$, $abs$, and $F$ were assumed to be 1. $N$ was assumed to be 2 h⁻¹ for bathroom use and 0.6 h⁻¹ for outdoor and other indoor use. $BW$ was assumed to be 64.2 kg, the mean body weight of Korean adults. $V$ was assumed to be 9.3 m³ for the bathroom and 33.3 m³ for outdoor and other indoor spaces.14) $IR$ was assumed to be 0.6 m³/h, the mean inhalation rate for Korean adults reported in the Korean exposure factors handbook.15)
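As a worked illustration, the equation and the default values above can be written as a small function. This is a minimal sketch, not the KNIER implementation: the function name, argument names, and the example inputs are assumptions made for illustration only.

```python
import math

def daily_inhalation_exposure(ap_mg, t_h, n_per_day, *, wf=1.0, f=1.0,
                              abs_frac=1.0, vent_per_h=0.6, ir_m3_h=0.6,
                              volume_m3=33.3, bw_kg=64.2):
    """Return D_inh in mg/kg/day for one product-use pattern.

    Defaults follow the values stated in the text for outdoor and other
    indoor use; pass vent_per_h=2.0 and volume_m3=9.3 for bathroom use.
    """
    # (1 - exp(-N * t)) term from the volatilization equation.
    decay_term = 1.0 - math.exp(-vent_per_h * t_h)
    numerator = ap_mg * wf * f * decay_term * ir_m3_h * abs_frac * n_per_day
    return numerator / (volume_m3 * vent_per_h * bw_kg)

# Hypothetical bathroom use: 5,000 mg per event, 15 min per event,
# one event every two days. These inputs are invented, not survey values.
d_inh = daily_inhalation_exposure(5000.0, 0.25, 0.5,
                                  vent_per_h=2.0, volume_m3=9.3)
print(f"{d_inh:.4f} mg/kg/day")
```

Keeping the scenario constants as keyword arguments makes it explicit which inputs come from the survey (amount, duration, frequency) and which are fixed assumptions.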

### 3. Comparison of individual exposure estimates and probabilistic exposure distributions

Individual exposure estimates were compared with the distribution derived from probabilistic exposure assessment. The comparison process is illustrated in Fig. 1. Individual exposure values were calculated to estimate real-life CP exposures, using each respondent's answers to the survey questions about product use. Each respondent's exposure parameters were entered into the exposure algorithm to obtain an individual exposure value. The values were then combined to obtain a distribution of exposure in the parent population. Statistics (50th and 95th percentiles) of the individual exposure estimates were compared with those of the probabilistic exposure distribution.

Figure 1. Schematic of the process for comparison of individual exposure estimates and probabilistic exposure distributions

Thereafter, probabilistic exposure assessments for the CPs were performed. The distributions of three input parameters (use frequency, use amount, and use duration) were obtained using five GOF methods: the chi-squared test, K-S test, A-D test, AIC, and BIC. The distributions of the input parameters were put into the exposure algorithm using Monte Carlo simulation (10,000 iterations) to obtain a probabilistic exposure distribution. All computational simulations for probabilistic exposure assessment were performed using @RISK 7.5 (Palisade Corporation, Ithaca, NY, USA), and statistics (50th and 95th percentiles) of the probabilistic exposure distributions were calculated.
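The selection step can be sketched as follows. This is not a reproduction of the @RISK fitting routines: it assumes only two candidate distributions (lognormal and exponential) fitted by maximum likelihood, a synthetic sample in place of survey data, and hypothetical function names. The chi-squared test is omitted for brevity.

```python
import math
import random

def fit_lognormal(x):
    """Maximum-likelihood lognormal fit: CDF, log-likelihood, parameter count."""
    logs = [math.log(v) for v in x]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((lv - mu) ** 2 for lv in logs) / len(logs))
    cdf = lambda v: 0.5 * (1.0 + math.erf((math.log(v) - mu) / (sigma * math.sqrt(2.0))))
    loglik = sum(-math.log(v * sigma * math.sqrt(2.0 * math.pi))
                 - (math.log(v) - mu) ** 2 / (2.0 * sigma ** 2) for v in x)
    return {"cdf": cdf, "loglik": loglik, "k": 2}

def fit_exponential(x):
    """Maximum-likelihood exponential fit: CDF, log-likelihood, parameter count."""
    rate = len(x) / sum(x)
    cdf = lambda v: 1.0 - math.exp(-rate * v)
    loglik = len(x) * math.log(rate) - rate * sum(x)
    return {"cdf": cdf, "loglik": loglik, "k": 1}

def gof_scores(x, fit):
    """K-S and A-D statistics plus AIC and BIC; lower is better for all four."""
    xs, n = sorted(x), len(x)
    # Clamp CDF values away from 0 and 1 so the A-D logarithms stay finite.
    p = [min(max(fit["cdf"](v), 1e-12), 1.0 - 1e-12) for v in xs]
    ks = max(max((i + 1) / n - p[i], p[i] - i / n) for i in range(n))
    ad = -n - sum((2 * i + 1) * (math.log(p[i]) + math.log(1.0 - p[n - 1 - i]))
                  for i in range(n)) / n
    aic = 2 * fit["k"] - 2 * fit["loglik"]
    bic = fit["k"] * math.log(n) - 2 * fit["loglik"]
    return {"K-S": ks, "A-D": ad, "AIC": aic, "BIC": bic}

# Synthetic right-skewed sample standing in for one exposure factor.
random.seed(0)
sample = [random.lognormvariate(0.0, 1.5) for _ in range(500)]

fits = {"lognormal": fit_lognormal(sample), "exponential": fit_exponential(sample)}
scores = {name: gof_scores(sample, fit) for name, fit in fits.items()}

# For each criterion, select the candidate with the lowest score.
selected = {crit: min(scores, key=lambda name: scores[name][crit])
            for crit in ("K-S", "A-D", "AIC", "BIC")}
print(selected)
```

With a larger set of candidate distributions, the criteria need not agree on the selected distribution, which is the situation this study examines.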

To compare performance among GOF methods, the ratio of the probabilistic exposure estimate to the individual exposure estimate was calculated. The statistics of each probabilistic exposure estimate obtained using each GOF method were contrasted with those of the individual exposure estimates. With this comparison, the GOF methods used for probabilistic exposure assessment were evaluated.
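The whole comparison can be sketched end to end as below. Several things here are assumptions for illustration and are not taken from the study: the survey records are synthetic, every input is fitted with a lognormal distribution, the three inputs are sampled independently, and the outdoor scenario constants are used.

```python
import math
import random

def d_inh(ap, t, n, vent=0.6, ir=0.6, vol=33.3, bw=64.2):
    """Daily inhalation exposure (mg/kg/day) with Wf, F and abs fixed at 1."""
    return ap * (1.0 - math.exp(-vent * t)) * ir * n / (vol * vent * bw)

def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100)."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def lognormal_mle(x):
    """Maximum-likelihood (mu, sigma) of a lognormal distribution."""
    logs = [math.log(v) for v in x]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((lv - mu) ** 2 for lv in logs) / len(logs))
    return mu, sigma

rng = random.Random(1)
# Synthetic stand-in for survey records:
# (amount in mg/event, duration in h/event, frequency in events/day).
records = [(rng.lognormvariate(8.0, 1.0),
            rng.lognormvariate(-2.0, 0.7),
            rng.lognormvariate(-2.5, 1.2)) for _ in range(300)]

# Individual approach: one exposure value per respondent.
individual = [d_inh(ap, t, n) for ap, t, n in records]

# Probabilistic approach: fit each input separately, then run
# 10,000 Monte Carlo iterations with independently sampled inputs.
params = [lognormal_mle([r[j] for r in records]) for j in range(3)]
simulated = [d_inh(*(rng.lognormvariate(mu, sigma) for mu, sigma in params))
             for _ in range(10_000)]

# Ratio of probabilistic to individual percentile; 1 means perfect agreement.
ratios = {q: percentile(simulated, q) / percentile(individual, q) for q in (50, 95)}
print(ratios)
```

A ratio above 1 means the probabilistic assessment gives a higher percentile than the individual estimates, and a ratio below 1 means it gives a lower one.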

### III. Results and Discussion

Individual exposure was calculated by applying the exposure factors derived from the survey to relevant exposure scenarios and using an appropriate algorithm. The exposure distribution of each CP in the parent population was estimated by combining individual exposure (Table 1). For respondents at the 95th-percentile exposure level, liquid household bleach was associated with the highest exposure estimate (9.4425 mg/kg/day), followed by liquid drain cleaner (5.5644 mg/kg/day), and mold-stain spray (0.7314 mg/kg/day). A similar trend was observed for the 50th-percentile. Conversely, anti-fogging spray was associated with the lowest exposure estimate (0.0112 mg/kg/day) for respondents at the 95th-percentile exposure level, followed by rain-repelling spray (0.0189 mg/kg/day) and rust-inhibiting spray (0.0198 mg/kg/day). For respondents at the 50th-percentile exposure level, estimates associated with these three CPs were still small.

Table 1. Statistics of individual exposure estimates for consumer products. Individual inhalation exposure is given in mg/kg/day.

| Products | N | Mean | SD* | 50th | 75th | 95th | 99th |
|---|---|---|---|---|---|---|---|
| Mold-stain spray | 924 | 0.1794 | 0.4104 | 0.0497 | 0.1149 | 0.7314 | 2.7926 |
| Anti-fogging spray | 305 | 0.0042 | 0.0219 | 0.0015 | 0.0039 | 0.0112 | 0.0194 |
| Tire-shine spray | 315 | 0.0105 | 0.0207 | 0.0049 | 0.0107 | 0.0415 | 0.0869 |
| Car-coating spray | 643 | 0.0272 | 0.0596 | 0.0077 | 0.0252 | 0.1079 | 0.3007 |
| Glass-cleaning spray | 350 | 0.0140 | 0.0242 | 0.0066 | 0.0158 | 0.0511 | 0.1202 |
| Rain-repelling spray | 384 | 0.0046 | 0.0095 | 0.0017 | 0.0043 | 0.0189 | 0.0603 |
| Liquid household bleach | 2,315 | 2.5316 | 4.2399 | 1.1002 | 2.8506 | 9.4425 | 18.8850 |
| Liquid washing-machine cleaner | 442 | 0.0120 | 0.0304 | 0.0022 | 0.0085 | 0.0532 | 0.1597 |
| Liquid drain cleaner | 1,843 | 1.4951 | 4.1552 | 0.5104 | 1.3753 | 5.5644 | 14.1038 |
| Rust-inhibiting spray | 371 | 0.0040 | 0.0140 | 0.0003 | 0.0024 | 0.0198 | 0.0584 |

*SD, standard deviation.

These individual exposure estimates were considered to be true values in comparisons with the results from probabilistic exposure assessment. This was because the questionnaire data were collected from a sample population that reflected the gender and regional distribution of the national population. Additionally, each CP was associated with a sample size of 300 or more; thus, the sample population for each CP was considered large enough to be nationally representative. The individual exposure results were therefore considered to reflect actual individual exposure in the Korean population. Lim et al.16) used a similar method to compare distributions associated with product-based and receptor-based aggregate exposure doses in their CP exposure assessment.

The ratios of probabilistic to individual exposure at the 50th- and 95th-percentile exposure levels were calculated (Table 2). Based on the K-S test, two products had 95th-percentile ratios of 0.9 to 1.1, and four products had ratios of 0.8 to 1.2. Based on the A-D test, three products had 95th-percentile ratios of 0.9 to 1.1, and four products had ratios of 0.8 to 1.2. The exposure ratios varied greatly among GOF methods for CPs such as liquid washing-machine cleaner and liquid drain cleaner. By contrast, the variation in exposure ratios was small for anti-fogging spray and car-coating spray.
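The band counts quoted above can be checked directly against the 95th-percentile columns of Table 2. The short script below transcribes those columns; the helper name `count_within` is introduced here for illustration.

```python
# 95th-percentile ratios transcribed from Table 2, in the table's product order:
# mold-stain, anti-fogging, tire-shine, car-coating, glass-cleaning,
# rain-repelling, household bleach, washing-machine cleaner, drain cleaner,
# rust-inhibiting.
ratios_95 = {
    "Chi-squared": [3.25, 1.66, 20.33, 1.19, 32.10, 16.95, 3.08, 1.37, 25.47, 0.88],
    "K-S": [0.85, 1.60, 6.44, 1.02, 1.18, 1.27, 1.09, 1.60, 1.41, 0.73],
    "A-D": [0.82, 1.66, 1.07, 0.95, 1.21, 1.29, 1.05, 1.92, 1.67, 0.73],
    "AIC": [2.53, 2.05, 1.57, 1.23, 7.16, 3.71, 1.30, 21.03, 49.59, 5.30],
    "BIC": [2.84, 2.09, 1.66, 1.26, 6.87, 3.50, 1.24, 21.22, 47.63, 4.58],
}

def count_within(values, low, high):
    """Number of ratios falling inside the closed interval [low, high]."""
    return sum(low <= v <= high for v in values)

# For each method: (count within 0.9-1.1, count within 0.8-1.2).
band_counts = {method: (count_within(vals, 0.9, 1.1), count_within(vals, 0.8, 1.2))
               for method, vals in ratios_95.items()}

for method, (narrow, wide) in band_counts.items():
    print(f"{method}: {narrow} within 0.9-1.1, {wide} within 0.8-1.2")
```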

Table 2. Ratios of probabilistic exposure estimates to individual exposure estimates using 50th- and 95th-percentile values.

| Products | 50th: Chi-squared | 50th: K-S | 50th: A-D | 50th: AIC | 50th: BIC | 95th: Chi-squared | 95th: K-S | 95th: A-D | 95th: AIC | 95th: BIC |
|---|---|---|---|---|---|---|---|---|---|---|
| Mold-stain spray | 2.49 | 0.98 | 0.88 | 0.66 | 0.67 | 3.25 | 0.85 | 0.82 | 2.53 | 2.84 |
| Anti-fogging spray | 0.71 | 0.79 | 0.85 | 0.61 | 0.60 | 1.66 | 1.60 | 1.66 | 2.05 | 2.09 |
| Tire-shine spray | 1.60 | 1.22 | 0.81 | 0.42 | 0.42 | 20.33 | 6.44 | 1.07 | 1.57 | 1.66 |
| Car-coating spray | 0.85 | 1.08 | 1.10 | 0.85 | 0.88 | 1.19 | 1.02 | 0.95 | 1.23 | 1.26 |
| Glass-cleaning spray | 1.73 | 0.84 | 0.86 | 0.78 | 0.78 | 32.10 | 1.18 | 1.21 | 7.16 | 6.87 |
| Rain-repelling spray | 1.29 | 1.02 | 1.03 | 0.50 | 0.53 | 16.95 | 1.27 | 1.29 | 3.71 | 3.50 |
| Liquid household bleach | 2.24 | 0.92 | 0.84 | 0.71 | 0.72 | 3.08 | 1.09 | 1.05 | 1.30 | 1.24 |
| Liquid washing-machine cleaner | 1.10 | 1.00 | 0.97 | 0.71 | 0.70 | 1.37 | 1.60 | 1.92 | 21.03 | 21.22 |
| Liquid drain cleaner | 1.37 | 0.91 | 0.91 | 0.69 | 0.70 | 25.47 | 1.41 | 1.67 | 49.59 | 47.63 |
| Rust-inhibiting spray | 1.05 | 0.77 | 0.77 | 0.61 | 0.61 | 0.88 | 0.73 | 0.73 | 5.30 | 4.58 |

K-S: Kolmogorov-Smirnov test, A-D: Anderson-Darling test, AIC: Akaike information criterion, BIC: Bayesian information criterion.

The chi-squared test often produced larger 95th-percentile exposure ratios than the other GOF methods. The 95th-percentile ratios derived from the K-S and A-D tests were smaller than those derived from the other methods for most CPs. The 95th-percentile ratios estimated using AIC and BIC were similar to each other. Trends in the 50th-percentile ratios were broadly comparable with those in the 95th-percentile ratios. However, in contrast to the results obtained for 95th-percentile values, the AIC and BIC methods tended to produce smaller 50th-percentile values than the other GOF methods. In general, 95th-percentile ratios estimated using the GOF methods were greater than 1; that is, the 95th-percentile values estimated using the GOF methods were mostly greater than the actual 95th-percentile value. Fewer of the 50th-percentile ratios than of the 95th-percentile ratios were greater than 1.
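The close agreement between AIC and BIC is consistent with their standard definitions, which are given here as general background rather than as part of the authors' analysis. For a candidate distribution with $k$ fitted parameters, maximized likelihood $\hat{L}$, and $n$ observations:

$AIC = 2k - 2\ln\hat{L}, \qquad BIC = k\ln n - 2\ln\hat{L}$

The two criteria share the term $-2\ln\hat{L}$ and differ only in the penalty, so that

$BIC - AIC = k\,(\ln n - 2)$

For the sample sizes in this study (305 to 2,315), $\ln n - 2$ lies between roughly 3.7 and 5.7. Candidates with the same number of parameters are therefore ranked identically by both criteria, and the rankings can differ only when candidates have different numbers of parameters.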

Exposure-estimate ratios calculated using the K-S and A-D tests were closer to 1 than those calculated using the other GOF methods. The K-S test is most sensitive around the median of a distribution,2) whereas the A-D test emphasizes the fit to the tails of a distribution.1) Therefore, the K-S and A-D methods appear to be more suitable for estimating probabilistic exposure than the other GOF methods. However, some methods underestimated the 95th-percentile exposure for some CPs (e.g., mold-stain spray, car-coating spray, rust-inhibiting spray), producing ratios of less than 1. Underestimating the 95th-percentile values can lead to errors in risk assessments based on reasonable maximum exposures. According to the US EPA, when a GOF method is used to obtain a distribution, the selected distribution function may vary depending on the number of parent data points and the tail of their distribution.5) The number of raw data points for the exposure factors used in this study varied from 305 to 2,315, and the skewness of the distributions also differed.13) The difference in the appropriateness of the results may have originated from these differences in the parent distributions.
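The different emphasis of the two tests can be read from their standard definitions, stated here as general background. With $F_n$ the empirical distribution function of the $n$ observations and $F$ the fitted distribution function:

$D_n = \sup_x \left| F_n(x) - F(x) \right|$

$A^2 = n \int_{-\infty}^{\infty} \frac{\left(F_n(x) - F(x)\right)^2}{F(x)\left(1 - F(x)\right)} \, dF(x)$

The K-S statistic $D_n$ is an unweighted maximum distance. Because the sampling variance of $F_n(x)$, which is $F(x)(1 - F(x))/n$, is largest where $F(x) = 0.5$, the largest deviations tend to occur near the median. The A-D statistic $A^2$ divides the squared distance by $F(x)(1 - F(x))$, which approaches zero in both tails, so the same deviation counts for more in the tails than near the median.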

The results indicated that the use of different GOF methods can lead to significantly different outcomes; therefore, an appropriate GOF method should be carefully selected for probabilistic exposure assessment. However, no single GOF method was the most appropriate in all cases, and the selection should be case-dependent. This was the first study to compare the results of probabilistic exposure assessment according to the selection of GOF method. Its main contribution is to show that the results differ significantly depending on the GOF method selected. Further research is needed to understand why these GOF methods produce different results.

The values calculated using the exposure algorithm did not represent actual internal exposure, because the fraction of a specific chemical in the products, the absorption rate, and the emission rate were assumed to be 1. This was done to exclude variation due to chemical type. This study therefore focused on the outcomes of using different GOF methods to estimate distributions of exposure factors related to product-usage patterns. How different GOF methods affect the accuracy of internal exposure estimation via probabilistic exposure assessment should be investigated in a separate study.

### IV. Conclusions

This study conducted probabilistic exposure assessments using five GOF methods for 10 CPs, each with at least 300 users in the survey. The results of the probabilistic exposure assessments were compared, and they differed significantly according to the GOF method selected. Depending on the GOF method used, the 95th percentile of the probabilistic exposure distribution was up to 49.59 times the true value. This difference may arise from the number of parent data points and the shape of their distribution. This study does not suggest that any single GOF method is the best for probabilistic exposure assessment of CPs. However, exposure scientists should choose the GOF method carefully when conducting probabilistic exposure assessments.

### Acknowledgments

This study was partially supported by the Basic Science Research Program through the NRF funded by the Ministry of Education, Science and Technology (NRF-2019R1A2C1083938).

Miyoung Lim (Research Professor), Kiyoung Lee (Professor)

### References

1. U.S. Environmental Protection Agency. Risk Assessment Guidance for Superfund: Volume III - Part A, Process for Conducting Probabilistic Risk Assessment. Washington, DC: EPA; 2001. https://www.epa.gov/sites/default/files/2015-09/documents/rags3adt_complete.pdf
2. Cullen AC, Frey HC. Probabilistic Techniques in Exposure Assessment: A Handbook for Dealing with Variability and Uncertainty in Models and Inputs. New York: Plenum Press; 1999. https://www.worldcat.org/ko/title/39951660
3. U.S. Environmental Protection Agency. Guiding Principles for Monte Carlo Analysis. Washington, DC: EPA; 1997. https://www.epa.gov/sites/default/files/2014-11/documents/montecar.pdf
4. Chiew E, Cauthen K, Brown N, Nozick L. Comparison of distribution selection methods. Commun Stat Simul Comput. 2022; 51(4): 1982-2005. https://doi.org/10.1080/03610918.2019.1691227
5. U.S. Environmental Protection Agency. Report of the Workshop on Selecting Input Distributions for Probabilistic Assessments. Washington, DC: EPA; 1999. https://nepis.epa.gov/Exe/ZyNET.exe/30004ZPJ.TXT?ZyActionD=ZyDocument&Client=EPA&Index=1995+Thru+1999&Docs=&Query=&Time=&EndTime=&SearchMethod=1&TocRestrict=n&Toc=&TocEntry=&QField=&QFieldYear=&QFieldMonth=&QFieldDay=&IntQFieldOp=0&ExtQFieldOp=0&XmlQuery=&File=D%3A%5Czyfiles%5CIndex Data%5C95thru99%5CTxt%5C00000013%5C30004ZPJ.txt&User=ANONYMOUS&Password=anonymous&SortMethod=h%7C-&MaximumDocuments=1&FuzzyDegree=0&ImageQuality=r75g8/r75g8/x150y150g16/i425&Display=hpfr&DefSeekPage=x&SearchBack=ZyActionL&Back=ZyActionS&BackDesc=Results page&MaximumPages=1&ZyEntry=1&SeekPage=x&ZyPURL
6. Gilsenan MB, Lambe J, Gibney MJ. Assessment of food intake input distributions for use in probabilistic exposure assessments of food additives. Food Addit Contam. 2003; 20(11): 1023-1033.
7. Cai T, Xia Y, Zhou Y. Generalized inflated discrete models: a strategy to work with multimodal discrete distributions. Sociol Methods Res. 2021; 50(1): 365-400. https://doi.org/10.1177/0049124118782535
8. Muoka AK, Ngesa OO, Waititu AG. Statistical models for count data. Sci J Appl Math Stat. 2016; 4(6): 256-262. https://www.sciencepublishinggroup.com/journal/paperinfo?journalid=149&doi=10.11648/j.sjams.20160406.12
9. Böhning D. Zero-inflated poisson models and C.A.MAN: a tutorial collection of evidence. Biom J. 1998; 40(7): 833-843. https://doi.org/10.1002/(SICI)1521-4036(199811)40:7%3c833::AID-BIMJ833%3e3.0.CO;2-O
10. Karlis D, Ntzoufras I. Analysis of sports data by using bivariate Poisson models. J R Stat Soc Ser D Stat. 2003; 52(3): 381-393. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9884.00366
11. Olanrewaju RO. Integer-valued time series model via generalized linear models technique of estimation. Int Ann Sci. 2018; 4(1): 35-43. https://journals.aijr.org/index.php/ias/article/view/455
12. Dimitroulopoulou C, Lucica E, Johnson A, Ashmore MR, Sakellaris I, Stranger M, et al. EPHECT I: European household survey on domestic use of consumer products and development of worst-case scenarios for daily use. Sci Total Environ. 2015; 536: 880-889.
13. Park JY, Lim M, Yang W, Lee K. Exposure factors for cleaning, automotive care, and surface protection products for exposure assessments. Food Chem Toxicol. 2017; 99: 128-134.
14. National Institute of Environmental Research. Regulation of Purpose and Method of Concerned Products Risk Assessment. Incheon: National Institute of Environmental Research; 2016. https://www.law.go.kr/admRulLsInfoP.do?admRulSeq=2100000071414
15. National Institute of Environmental Research. Korean Exposure Factors Handbook. Incheon: National Institute of Environmental Research; 2019. https://www.nl.go.kr/NL/contents/search.do?srchTarget=total&pageNum=1&pageSize=10&kwd=%ED%95%9C%EA%B5%AD%EC%9D%B8%EC%9D%98+%EB%85%B8%EC%B6%9C%EA%B3%84%EC%88%98+%ED%95%B8%EB%93%9C%EB%B6%81#viewKey=696876600&viewType=AH1&category=%EB%8F%84%EC%84%9C&pageIdx=1&jourId=
16. Lim M, Park JY, Lim JE, Moon HB, Lee K. Receptor-based aggregate exposure assessment of phthalates based on individual's simultaneous use of multiple cosmetic products. Food Chem Toxicol. 2019; 127: 163-172.

### Vol.48 No.5 October, 2022

pISSN 1738-4087
eISSN 2233-8616

Frequency: Bimonthly