Pitfalls in data integration |
|
Coordinator 1 | Mr Philip Adebahr (Chemnitz University of Technology) |
Coordinator 2 | Mrs Sandra Jaworeck (Chemnitz University of Technology) |
During the COVID-19 pandemic, researchers carried out a large number of studies and generated the corresponding data. Data sets collected independently of each other cannot simply be analyzed together. Data integration offers a way to connect these data and analyze them jointly, yielding additional insights into what is happening, for example during the COVID-19 crisis. Despite this promise, data integration involves many pitfalls, because it combines different statistical tools such as weighting, (multiple) imputation, data fusion, and data harmonization. Harmonization in turn combines validity and reliability checks with (multiple) equating. While the individual methods and tools have been discussed extensively, the limitations of their interplay have received little attention.
Consider an example: equating is based on representative data. How do weighting procedures influence equating? To what extent are weighted, harmonized data suitable for further analysis? How does this influence our results? In this session, we will discuss the pitfalls of combining statistical methods of data integration, their interplay and order of implementation, their influence on our analyses and results, and possible solutions.
Contributions need not deal with COVID-19 to be accepted in this session.