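The Joint Statistical Meetings (JSM), to be held August 8-12, 2021, is one of the largest statistical events in the world. With more than 13 professional societies participating, the meeting brings together 6,500+ attendees from 52 countries, 80+ exhibitors, 1,000+ student attendees, and 75+ employers recruiting for 200+ positions. Whether the event is virtual or in person, Westat continues to contribute cutting-edge research to it.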
“Every year, JSM provides an excellent opportunity for all to come together and network with our colleagues from across the nation and around the globe,” says Jeri Mulrow, Westat Vice President and Director of Statistics and Evaluation Sciences. “In the face of the unprecedented challenges of 2020, Westat’s statisticians and data scientists are sharing innovative ideas and methods that have helped us push the field forward, meet our clients’ research needs, and strengthen statistical research for all.”
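Learn more about how we can help you meet your project challenges. See our Expert Guide (PDF), which describes our multimode capabilities and innovative approaches to advancing the science of data collection.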
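This year, in the spirit of the event’s theme, “Statistics, Data, and the Stories They Tell,” we share some of our statistics and data stories below. (Note: * indicates the presenter.)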
Sunday, August 8, 2021: 3:30-5:20 pm (ET): An Approach to Estimate the Re-Identification Risk in Longitudinal Survey Microdata: Jianzhu Li*, Lin Li, Tom Krenzke
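Protecting the confidential data of survey respondents is essential. Before a statistical data product is released, a risk assessment is needed to ensure that the disclosure risk is acceptably low. In longitudinal surveys, because the same respondents participate in more than one wave, the re-identification risk is typically higher than it is for cross-sectional data. Common variables that do not change over time, or that change in patterns, may allow users to link records across individual files to form longitudinal records. Here, we share a survey example that demonstrates using a log-linear modeling approach to measure re-identification risk while incorporating the longitudinal nature of the data, which makes it possible to measure the increase in longitudinal risk relative to cross-sectional risk.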
Tuesday, August 10, 2021: 10-11:50 am (ET): Leveraging Administrative Data to Improve Child Passenger Safety: Elizabeth Petraglia*
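Administrative, or “found,” data (tax records, sensor data, transaction receipts) have become a common resource for research. Such alternative data sources can help researchers draw more detailed inferences, often while improving statistical efficiency and decreasing cost. We present, as an example, the National Digital Car Seat Check Form (NDCF), which collects detailed and user-friendly administrative data on child passenger safety (CPS) in the course of a typical car seat check. Here, we explore the NDCF’s data offerings and some innovative methods, such as visualization-based and personalized dashboards, for distributing NDCF data to users with varying levels of data literacy. We compare the NDCF with selected CPS-related surveys and observational studies, assessing the strengths and limitations of each source in terms of coverage, detail, data quality, and sample size. Finally, we offer recommendations for practical applications in the field, including exploratory work on combining survey-based data with the NDCF.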
Wednesday, August 11, 2021: 1:30-3:20 pm (ET): Creating Base Weights and Replicate Weights for a PPS Sample with a Supplemental Sample When the Eligibility Frame Information Is Available After Sampling: Jian Chen*, Ismael Flores Cervantes, Mike Kwanisai
Supplemental samples are used in surveys to increase sample sizes when there is a low response rate or high ineligibility. There are no straightforward methods for drawing supplemental samples for a systematic probability proportional to size (PPS) sample design after the main sample has been selected. We examine a situation in which a large number of ineligible units in the sampling frame became known after the main sample selection but before the supplemental sample was drawn. A non-overlapping supplemental sample was drawn by randomly offsetting the random start of the main sample interval. We explore and evaluate several methods for creating the base and replicate weights that properly reflect the variance estimates for this design. Finally, we compare the empirical bias and variance of these methods using Monte Carlo simulation.
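As a rough illustration of the selection step described above, the following Python sketch draws a systematic PPS sample and then a supplemental sample from an offset random start. It is not the authors' procedure: the function names, the toy measures of size, the assumption that no unit is as large as the sampling interval, and the step that drops any supplemental hit already in the main sample are all hypothetical choices made for this example. Weight construction, the subject of the talk, is not attempted here.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def systematic_pps(sizes, n, start):
    """Return the indices of units hit by the points start, start + I, ...

    sizes: measure of size per frame unit (assumed smaller than the interval I).
    n:     number of selection points.
    start: random start in [0, I), where I = sum(sizes) / n.
    """
    interval = sum(sizes) / n
    cum = list(accumulate(sizes))  # cumulative upper bound of each unit
    picks = []
    for k in range(n):
        point = start + k * interval
        # First unit whose cumulative upper bound exceeds the selection point.
        idx = min(bisect_right(cum, point), len(cum) - 1)
        picks.append(idx)
    return picks


def draw_main_and_supplement(sizes, n, rng):
    """Draw a main sample, then a supplemental sample from an offset start."""
    interval = sum(sizes) / n
    start = rng.random() * interval
    # Hypothetical rule: shift the start by a random offset within the interval.
    supp_start = (start + rng.random() * interval) % interval
    main = systematic_pps(sizes, n, start)
    in_main = set(main)
    # Assumption for this sketch: drop supplemental hits already in the main sample.
    supp = [i for i in systematic_pps(sizes, n, supp_start) if i not in in_main]
    return main, supp


if __name__ == "__main__":
    toy_sizes = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up measures of size
    main, supp = draw_main_and_supplement(toy_sizes, 4, random.Random(2021))
    print("main:", main, "supplemental:", supp)
```

Because larger units span more of the cumulative size scale, an offset start alone does not guarantee that the two samples are disjoint, which is why this sketch filters the supplemental hits explicitly.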
Wednesday, August 11, 2021: 1:30-3:20 pm (ET): A Comparison of Two CHAID Packages for Modeling Survey Nonresponse: Tien-Huan Lin*, Carlos Arieira, Ismael Flores Cervantes, Mike Kwanisai
When it comes to unit nonresponse, it is common practice to lessen bias by modeling response propensity and adjusting weights to account for different response propensities. The CHAID (Chi-square Automatic Interaction Detector) algorithm is commonly used to produce weighting classes for this purpose, which brings us to the analysis of 2 popular software packages that implement the CHAID algorithm: SI-CHAID and HPSPLIT. Using simulations of a complex survey sample design, we describe the pros and cons of the 2 packages in terms of the resulting bias and variance of the weighted estimates, and we examine the packages’ interchangeability.
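To make the weighting-class idea concrete, here is a minimal Python sketch of a standard weighting-class nonresponse adjustment. It assumes the classes have already been formed (for example, by a CHAID tree) and does not call SI-CHAID or HPSPLIT; the record layout and the field names `cell`, `base_weight`, and `responded` are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjust(records):
    """Inflate respondents' base weights within each weighting class.

    records: list of dicts with hypothetical keys 'cell' (weighting class),
             'base_weight' (float), and 'responded' (bool).
    Returns a new list holding respondents only, each with 'adj_weight' added.
    """
    total = defaultdict(float)  # weighted count of all sampled units per class
    resp = defaultdict(float)   # weighted count of respondents per class
    for r in records:
        total[r["cell"]] += r["base_weight"]
        if r["responded"]:
            resp[r["cell"]] += r["base_weight"]

    adjusted = []
    for r in records:
        if r["responded"]:
            # Adjustment factor = inverse of the weighted response rate in the class.
            factor = total[r["cell"]] / resp[r["cell"]]
            adjusted.append({**r, "adj_weight": r["base_weight"] * factor})
    return adjusted


if __name__ == "__main__":
    sample = [  # made-up records for illustration
        {"cell": "A", "base_weight": 10.0, "responded": True},
        {"cell": "A", "base_weight": 10.0, "responded": False},
        {"cell": "B", "base_weight": 5.0, "responded": True},
        {"cell": "B", "base_weight": 5.0, "responded": True},
        {"cell": "B", "base_weight": 10.0, "responded": False},
    ]
    for row in nonresponse_adjust(sample):
        print(row["cell"], row["adj_weight"])
```

Within each class, the adjusted respondent weights sum to the base-weight total of all sampled units, so the choice of classes determines how much bias reduction and variance inflation the adjustment produces.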
Wednesday, August 11, 2021: 1:30-3:20 pm (ET): Evaluation of Methods to Form Segments from Census Blocks in Area Sample Designs: Jennifer Kali*, Tom Krenzke, Ying Chen, Jian Chen, Jim Green
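In-person surveys often use a multistage sample design in which households are sampled within geographic areas called segments, which improves cost efficiency by limiting the geographic range that data collectors must cover. Typically, segments are formed by grouping adjacent census blocks. A simple way to combine adjacent census blocks is to sort the census block file by census block ID, but this often creates segments that are noncontiguous, incomplete (containing holes), and not compact. Issues with contiguity and completeness create challenges for data collectors in determining which housing units to include in the sample frame, and segments that are not compact increase interviewers’ travel costs. We review other methods of forming segments; evaluate the segments formed under each sorting method in terms of contiguity, completeness, compactness, and between-segment variance; and introduce a segment-formation algorithm that uses all 4 sorting methods.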
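The simple sort-by-ID approach mentioned above can be sketched in a few lines of Python. This is only an illustration of the baseline, not the algorithm presented in the talk: the minimum housing-unit threshold, the rule for folding leftover blocks into the last segment, and the block IDs in the example are all assumptions.

```python
def form_segments(blocks, min_units):
    """Group census blocks into segments by walking them in block-ID order.

    blocks:    list of (block_id, housing_units) tuples.
    min_units: hypothetical minimum number of housing units per segment.
    Returns a list of segments, each a list of block IDs.
    """
    segments, current, count = [], [], 0
    for block_id, units in sorted(blocks):  # sort by block ID
        current.append(block_id)
        count += units
        if count >= min_units:  # segment is large enough, so close it
            segments.append(current)
            current, count = [], 0
    if current:
        # Assumption: leftover blocks are folded into the last segment.
        if segments:
            segments[-1].extend(current)
        else:
            segments.append(current)
    return segments


if __name__ == "__main__":
    toy_blocks = [("1003", 20), ("1001", 30), ("1002", 40), ("1004", 70), ("1005", 10)]
    print(form_segments(toy_blocks, min_units=60))
```

Nothing in this procedure looks at geography, so blocks that are neighbors in ID order but not on the map can end up in the same segment, which is the source of the contiguity, completeness, and compactness problems the abstract describes.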
Wednesday, August 11, 2021: 1:30-3:20 pm (ET): Modeling Survey Nonresponse Under a Cluster Sample Design: Classification and Regression Tree Methodologies Compared: Michael Jones*, William Everett Cecere, Tien-Huan Lin, Jennifer Kali, Ismael Flores Cervantes
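Survey weights computed for analyzing complex sample survey data are typically adjusted for nonresponse to reduce bias in the estimates. Many algorithms and methodologies are available for modeling survey nonresponse for these adjustments. Which approach is best? We take a close look at selected algorithms and compare them in the context of a complex cluster sample design. We also evaluate the effect of classification tree-based methods on the reduction of nonresponse bias in high-response and low-response settings, and we examine how these methods perform when adjusting survey weights. What are the benefits and limitations of using these methods for estimating response propensities in surveys that use a cluster sample? We discuss those as well.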