ESRA logo


Wednesday 17th July 2013, 14:00 - 15:30, Room: No. 20

Errors in social networks research designs

Convenor Dr Anja Žnidaršič (Faculty of Organizational Sciences, University of Maribor)

Session Details

Social network data can be gathered in different ways, using questionnaires, interviews, observations, archival records, or experiments. Despite the variety of network data collection techniques, data are most frequently gathered by surveys. All methods have the potential to introduce different types of errors. Errors in the research design can be classified into three main categories: the boundary specification problem, the design of the questionnaire, and errors caused by actors. The boundary specification problem concerns the rules for including actors in the network and, consequently, the problem of missing actors. The questionnaire format can be a source of errors due to the fixed or free choice design, the use of recognition (where a complete list of units is provided) or recall, and the direction of questions (e.g. measuring the provision or the receipt of social support). Errors in the third category are caused by actors: actor non-response, non-response on particular tie(s), and measurement errors.

It is important to recognize the sources of errors in research designs, to understand how certain types of errors can be reduced, and to assess the impact of errors on the results obtained with established tools in social network analysis.


Paper Details

1. Can structural equivalence reveal regularly connected cohesive subgroups?

Dr Anja Žnidaršič (University of Maribor, Faculty of Organizational Sciences)
Dr Anuška Ferligoj (University of Ljubljana, Faculty of Social Sciences)
Dr Patrick Doreian (University of Pittsburgh, Department of Sociology and University of Ljubljana, Faculty of Social Sciences)

One of the purposes of social network analysis is to extract, from large and seemingly incoherent networks, a simple description of the fundamental structure of relationships. A popular and widely used technique for finding such structural patterns is generalized blockmodeling. The result of a blockmodeling procedure is a partition of actors that determines positions, and a reduced graph (also called an image matrix) that represents the relationships (blocks) among the identified positions.

Actors are partitioned into clusters based on some type of equivalence. The most commonly used type is structural equivalence, and one generalization of it is regular equivalence.
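As a small illustration of the equivalence idea, the sketch below checks whether two actors in a binary network are structurally equivalent, i.e. whether they have identical ties to and from every other actor. The function name and the toy matrix are hypothetical, and the handling of the tie between the two actors themselves (here simply required to be symmetric, with the diagonal ignored) is an assumption, since conventions differ.

```python
def structurally_equivalent(adj, i, j):
    """Return True if actors i and j have identical ties to and from all others.

    adj is a square 0/1 adjacency matrix given as a list of lists.
    Assumption: the diagonal is ignored and the i-j tie must be symmetric.
    """
    n = len(adj)
    for k in range(n):
        if k in (i, j):
            continue
        # Compare outgoing ties (rows) and incoming ties (columns).
        if adj[i][k] != adj[j][k] or adj[k][i] != adj[k][j]:
            return False
    return adj[i][j] == adj[j][i]


# Toy network with two positions: actors {0, 1} and actors {2, 3}.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
same_position = structurally_equivalent(toy, 0, 1)
different_position = structurally_equivalent(toy, 0, 2)
```

In this toy network actors 0 and 1 occupy the same position, while actors 0 and 2 do not.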

Regardless of the data collection technique, all network data are prone to measurement errors. In binary networks, measurement errors occur when the data contain extra ties and/or lack ties that are present in the true underlying, unobservable structure.
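The two kinds of error in a binary network can be sketched as random flips of an adjacency matrix. This is only an illustrative sketch, not the simulation design used in the paper: the function names, the independent per-tie probabilities `p_extra` and `p_absent`, and the toy network are all assumptions.

```python
import random


def add_measurement_error(adj, p_extra, p_absent, seed=0):
    """Return a noisy copy of a binary adjacency matrix.

    Each absent tie (0) becomes an extra tie with probability p_extra;
    each present tie (1) is dropped with probability p_absent.
    The diagonal is left untouched. (Hypothetical error model.)
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in adj]
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if adj[i][j] == 0 and rng.random() < p_extra:
                noisy[i][j] = 1
            elif adj[i][j] == 1 and rng.random() < p_absent:
                noisy[i][j] = 0
    return noisy


def count_errors(true_adj, observed_adj):
    """Count (extra ties, absent ties) in observed_adj relative to true_adj."""
    extra = absent = 0
    n = len(true_adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if true_adj[i][j] == 0 and observed_adj[i][j] == 1:
                extra += 1
            elif true_adj[i][j] == 1 and observed_adj[i][j] == 0:
                absent += 1
    return extra, absent


true_net = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
observed = add_measurement_error(true_net, p_extra=0.1, p_absent=0.2, seed=1)
extra, absent = count_errors(true_net, observed)
```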

Blockmodeling procedures are often used to reveal cohesive subgroups of actors within a network. The impact of measurement error on identified blockmodel structures is studied by simulations based on an extended set of artificial networks. Three items are used in the simulations: a starting network; a cohesive subgroups model; different probabilities of ties.

The results show that structural equivalence is more stable in the face of measurement error. They also help us to determine whether structural equivalence could be used in place of regular equivalence when results using regular equivalence are unstable. We also attempt to determine which network properties (and/or blockmodel properties) have an impact on correctly identified blockmodel structures.



2. Actor non-response treatments in case of signed networks

Professor Patrick Doreian (University of Pittsburgh, Department of Sociology and University of Ljubljana, Faculty of Social Sciences)
Professor Anuška Ferligoj (University of Ljubljana, Faculty of Social Sciences)
Dr Anja Žnidaršič (University of Maribor, Faculty of Organizational Sciences)

While social networks often have information on the presence or absence of ties (binary networks), they can also have information on the strength of ties. If the valued ties are combined with positive or negative valences, signed networks are the outcome. For example, in social networks positive ties represent liking, while negative ties represent dislike between actors.
Based on the structure theorems of structural balance, if balance is operative, we can identify subgroups such that actors within each subgroup are positively connected among themselves, while the subgroups are mutually hostile. Partitioning signed networks using structural balance is located within the generalized blockmodeling framework: the resulting image matrix has positive or null blocks on the diagonal, while off-diagonal blocks have only negative or null ties (while allowing for some inconsistencies).
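The inconsistencies mentioned above can be counted for any candidate partition: negative ties inside a subgroup and positive ties between subgroups. The sketch below is a simple unweighted count under that reading. The function name, the toy signed matrix, and the equal treatment of both kinds of inconsistency are assumptions, not details given in the abstract.

```python
def balance_inconsistencies(signed, clusters):
    """Count ties that contradict a structural-balance partition.

    signed   : square matrix with entries +1, -1 or 0 (list of lists)
    clusters : clusters[i] is the subgroup label of actor i
    Returns (negative ties within subgroups, positive ties between subgroups).
    """
    n = len(signed)
    neg_within = pos_between = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            same = clusters[i] == clusters[j]
            if same and signed[i][j] < 0:
                neg_within += 1
            elif not same and signed[i][j] > 0:
                pos_between += 1
    return neg_within, pos_between


# Toy signed network: actors 0-1 and 2-3 like each other, except for
# one positive tie from actor 3 to actor 1 that crosses the subgroups.
signed_net = [
    [0, 1, -1, -1],
    [1, 0, -1, 0],
    [-1, -1, 0, 1],
    [-1, 1, 1, 0],
]
good_partition = balance_inconsistencies(signed_net, [0, 0, 1, 1])
poor_partition = balance_inconsistencies(signed_net, [0, 1, 0, 1])
```

A partitioning procedure would search for the partition that minimizes such a count; the search itself is not shown here.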
All network data, regardless of their level of measurement (e.g. binary, valued, or signed), are likely to be measured with errors. One source of error takes the form of actor non-response. In the matrix representation of the network, this means that we have a row of absent ties for each non-respondent, while incoming ties are available.
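In matrix terms, actor non-response can be sketched as follows: the row of each non-respondent is unknown, while the corresponding column still holds the ties reported by the respondents. The helper names and the use of `None` as the marker for an absent tie are assumptions made for this illustration.

```python
def apply_nonresponse(adj, nonrespondents):
    """Return a copy of adj where each non-respondent's row is unknown (None)."""
    missing = set(nonrespondents)
    return [
        [None] * len(row) if i in missing else row[:]
        for i, row in enumerate(adj)
    ]


def incoming_ties(observed, actor):
    """Ties directed at `actor` that were reported by responding actors."""
    return [
        row[actor]
        for i, row in enumerate(observed)
        if i != actor and row[actor] is not None
    ]


full_net = [
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]
observed_net = apply_nonresponse(full_net, nonrespondents=[1])
reported_about_1 = incoming_ties(observed_net, 1)
```

Here actor 1 reports nothing, but the ties that actors 0 and 2 direct at actor 1 remain available.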
We used five simple treatments in the simulations. The first is the complete-case approach, where, besides the row of absent ties for each non-respondent, the corresponding column is also deleted, resulting in a smaller network. A null tie imputation procedure means that all absent ties are recorded as zero. If the modal value of incoming ties for a non-


3. On collecting the sport networks data from the web

Mr Kristijan Breznik (Mfdps)
Professor Vladimir Batagelj (FMF)

In recent years huge amounts of sports data have become available on the web.
Sportsmen, teams and the games played between them can be analyzed
using statistics and network analysis. Sportsmen or their teams represent
actors (vertices) in networks, and different kinds of games determine a variety
of relations between them. However, when harvesting large data sets, such as
the FIDE chess network with hundreds of thousands of
vertices, we are usually faced with several technical and methodological
problems.

In this paper we present and discuss some of these problems, such as:
- the boundary problem: the list of the actors to be considered in a
network;
- the actor identification problem: some actors can be represented with
different names; some (different) actors can be represented with the same
name;
- data entry errors: typing errors, duplicated records, etc.;
- technical problems: interruptions of the harvesting process.
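One part of the actor identification and data entry problems listed above, the same actor appearing under different spellings and the same record appearing twice, can be sketched with simple name normalization. This is an illustrative Python sketch, not the authors' R procedures. The function names, the record layout (white, black, result) and the player names are hypothetical, and the sketch does not address different actors who share the same name.

```python
import unicodedata


def normalize_name(name):
    """Map spelling variants of a name to one canonical key.

    Strips diacritics, turns "Last, First" into "First Last",
    lowercases, and collapses repeated whitespace.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if "," in plain:
        last, first = plain.split(",", 1)
        plain = f"{first} {last}"
    return " ".join(plain.casefold().split())


def deduplicate_games(records):
    """Drop records that are identical after name normalization.

    Assumption: a record is (white, black, result); real data would
    also need fields such as date or event to tell rematches apart.
    """
    seen = set()
    unique = []
    for white, black, result in records:
        key = (normalize_name(white), normalize_name(black), result)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


games = [
    ("Müller, Jörg", "Ann Lee", "1-0"),
    ("Jorg  Muller", "ann lee", "1-0"),   # same game, different spelling
    ("Ann Lee", "Jorg Muller", "0-1"),    # a different game
]
clean_games = deduplicate_games(games)
```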

In the paper some solutions to these problems are proposed. The procedures
for collecting the network data from different web sources were written in
R, an open source programming language and software environment for
statistical computing and graphics.


4. The effects of measurement error on the structural properties of the citation networks

Miss Nuša Erman (UL, Faculty of Administration)
Mr Ljupco Todorovski (UL, Faculty of Administration)

Citation analysis takes as input a huge amount of bibliographical data that are often incomplete. This introduces several measurement errors into the citation network, which, in turn, influence the results of citation network analysis. Such incompleteness of citation data most frequently derives from a number of well-known problems in the sources of citation data: 1) boundary specification problem, 2) multiple authorship, 3) self-citations, 4) implicit citations, 5) homographs, 6) synonyms, and 7) other. Some of these problems can be eliminated in the first stage of citation analysis, i.e., when determining the study design and before collecting data, whereas others emerge only once the data have been collected.
The aim of the paper is to study and compare the effects of the above-mentioned sources of measurement errors on the results of citation network analysis. More specifically, we first introduce different levels of incompleteness into the collected data to obtain a number of artificial incomplete data sets and transform each of them into a citation network. We then perform a comparative analysis of the values of the structural properties of the citation network obtained from the original data and the corresponding values for the citation networks obtained from the artificial incomplete data sets, in order to check the accuracy of the analysis results. Our study includes the structural network properties of density and prestige.
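The comparison described above can be sketched on a toy citation network. The sketch makes several assumptions that the abstract does not state: incompleteness is modelled as removing a random fraction of citations, prestige is taken to be normalized indegree, and all function names and data are hypothetical.

```python
import random


def density(n_nodes, edges):
    """Share of possible directed ties (no self-citations) that are present."""
    return len(edges) / (n_nodes * (n_nodes - 1))


def indegree_prestige(n_nodes, edges):
    """Normalized indegree of each node (assumed prestige measure)."""
    indegree = [0] * n_nodes
    for _, cited in edges:
        indegree[cited] += 1
    return [d / (n_nodes - 1) for d in indegree]


def drop_citations(edges, fraction, seed=0):
    """Return a copy of edges with a random fraction of citations removed."""
    rng = random.Random(seed)
    n_keep = len(edges) - round(fraction * len(edges))
    return sorted(rng.sample(edges, n_keep))


# Each pair (a, b) means "paper a cites paper b".
citations = [(0, 1), (0, 2), (1, 2), (3, 2), (3, 1), (2, 0)]
n_papers = 4

incomplete = drop_citations(citations, fraction=0.5, seed=42)

original_density = density(n_papers, citations)
incomplete_density = density(n_papers, incomplete)
original_prestige = indegree_prestige(n_papers, citations)
incomplete_prestige = indegree_prestige(n_papers, incomplete)
```

Repeating the removal for several fractions and seeds would show how far each property drifts from its value in the original network.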