Thesis Guide·2026-06-25·15 mins

A Complete Guide to Collecting Panel Data for Accounting Research

Practical and methodological steps for collecting panel data on publicly listed companies, covering purposive sampling, data sources, and merging techniques.

Panel data (longitudinal data) combines time-series and cross-sectional observations: the same companies are observed over several periods. It is widely used in Indonesian accounting research because it lets researchers control for unobserved individual heterogeneity and offers more variation than a single cross-section or a single time series.
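One way to make "unobserved individual heterogeneity" concrete is to write out a generic panel regression. The notation here is an assumption chosen for illustration, not taken from a specific study: y_it is the outcome for company i in year t, x_it is a vector of explanatory variables, alpha_i is a company-specific effect that does not change over time, and epsilon_it is the error term.

```latex
y_{it} = \alpha_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T
```

Because each company appears T times, alpha_i can be estimated or removed, which is not possible with a single cross-section.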

1. Sample Selection (Purposive Sampling)

The first step is to establish the sample criteria. The most common criteria are that a company: (1) was listed on the Stock Exchange in every year of the observation period, (2) was not delisted during that period, (3) publishes its financial statements in a uniform currency, and (4) has complete data for all observation variables. Financial-sector companies (such as banks and insurers) are often excluded from general samples because their regulatory structure and financial reporting differ from those of other sectors.
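The criteria above can be sketched as a filter over firm-year records. This is a minimal illustration, not a definitive implementation. The function names, field names, tickers, sector labels, and the choice of IDR as the reporting currency are all assumptions, and "has a record in every year" is used as a stand-in for continuous listing.

```python
# All tickers, sectors, and values below are made up for illustration.
YEARS = {2020, 2021, 2022}
VARIABLES = ("total_assets", "net_income")


def purposive_sample(records, years, variables,
                     excluded_sectors=("Financials",), currency="IDR"):
    """Return the sorted tickers that satisfy every sampling criterion."""
    by_firm = {}
    for row in records:
        if row["year"] in years:  # keep only the observation window
            by_firm.setdefault(row["ticker"], []).append(row)

    selected = []
    for ticker, rows in by_firm.items():
        # Criteria 1 and 2: a record exists for every year in the window.
        listed_every_year = {r["year"] for r in rows} == set(years)
        # Criterion 3: one reporting currency throughout.
        uniform_currency = all(r["currency"] == currency for r in rows)
        # Criterion 4: no missing value in any required variable.
        complete_data = all(r[v] is not None for r in rows for v in variables)
        # Common extra rule: drop the financial sector.
        non_financial = all(r["sector"] not in excluded_sectors for r in rows)
        if listed_every_year and uniform_currency and complete_data and non_financial:
            selected.append(ticker)
    return sorted(selected)


def firm(ticker, sector, currency, years, missing_year=None):
    """Build toy firm-year rows; `missing_year` gets an empty net_income."""
    return [{"ticker": ticker, "year": y, "sector": sector, "currency": currency,
             "total_assets": 100.0,
             "net_income": None if y == missing_year else 10.0}
            for y in years]


records = (
    firm("AAAA", "Consumer", "IDR", [2020, 2021, 2022])        # passes everything
    + firm("BBBB", "Financials", "IDR", [2020, 2021, 2022])    # financial sector
    + firm("CCCC", "Consumer", "IDR", [2020, 2022])            # gap in 2021
    + firm("DDDD", "Mining", "USD", [2020, 2021, 2022])        # reports in USD
    + firm("EEEE", "Consumer", "IDR", [2020, 2021, 2022], missing_year=2021)
)

sample = purposive_sample(records, YEARS, VARIABLES)
print(sample)  # ['AAAA']
```

Recording how many companies each criterion removes makes the sample-selection table in the methodology chapter straightforward to reproduce.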

2. Secondary Data Sources

There are several primary databases for stock exchange research:

  • Official Exchange Websites: The primary source for official Annual Report PDFs, though downloading often must be done one by one.
  • Financial Terminals and Databases (Bloomberg / Thomson Reuters / OSIRIS): Highly comprehensive but expensive, and generally available only through elite university libraries.
  • NgepetData: A smart alternative allowing students to upload PDFs and extract custom ratios as if they had a personal financial terminal.

3. Data Management & Outliers

Once the data has been collected in a spreadsheet such as Excel, the next challenge is data cleaning. A common treatment for outliers is winsorization at the 1% and 99% levels: values beyond those percentiles are replaced with the percentile values, which dampens extreme observations without discarding data. When merging data from different sources, join on a key that uniquely identifies each observation, typically the combination of Ticker Symbol and Fiscal Year.
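Both cleaning steps can be sketched in plain Python. This is an illustration under stated assumptions: the function names, field names, and toy values are hypothetical, and the percentile uses linear interpolation between ranks, which is one of several conventions.

```python
def percentile(sorted_values, q):
    """Percentile with linear interpolation between ranks; q is in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def winsorize(values, lower=1, upper=99):
    """Clamp values to the [lower, upper] percentile range; no row is dropped."""
    ordered = sorted(values)
    low_cut, high_cut = percentile(ordered, lower), percentile(ordered, upper)
    return [min(max(v, low_cut), high_cut) for v in values]


def merge_on_keys(left, right, keys=("ticker", "fiscal_year")):
    """Inner-join two lists of dict rows on a composite key."""
    index = {}
    for row in right:
        key = tuple(row[k] for k in keys)
        if key in index:  # a repeated key would silently duplicate or overwrite data
            raise ValueError(f"duplicate key in right-hand source: {key}")
        index[key] = row
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:  # keep only firm-years present in both sources
            merged.append({**row, **match})
    return merged


# Toy variable with one extreme value (made-up numbers).
roa = list(range(100)) + [10_000]
roa_w = winsorize(roa)
print(min(roa_w), max(roa_w))  # 1.0 99.0

# Toy sources keyed by ticker and fiscal year (made-up numbers).
financials = [{"ticker": "AAAA", "fiscal_year": 2021, "roa": 0.08},
              {"ticker": "AAAA", "fiscal_year": 2022, "roa": 0.09}]
governance = [{"ticker": "AAAA", "fiscal_year": 2022, "board_size": 7},
              {"ticker": "BBBB", "fiscal_year": 2022, "board_size": 5}]
panel = merge_on_keys(financials, governance)
print(panel)
```

The duplicate-key check is a deliberate choice: if Ticker Symbol plus Fiscal Year does not identify exactly one row in a source, the merge inflates or corrupts the panel without any visible error. In pandas, the equivalent join is `DataFrame.merge` with `on=["ticker", "fiscal_year"]`.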

4. Estimation Model Selection

Panel data is generally estimated with one of three models: Pooled OLS, the Fixed Effect Model (FEM), and the Random Effect Model (REM). Students are typically expected to run the Chow Test (Pooled OLS vs FEM), the Hausman Test (FEM vs REM), and the Breusch-Pagan Lagrange Multiplier Test (Pooled OLS vs REM) to determine the most appropriate model specification for testing the research hypotheses.
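To show what the Chow Test compares, the sketch below computes the F statistic for Pooled OLS against FEM by hand for a single regressor, then encodes one common decision flow across the three tests. It is a simplified illustration, not a substitute for statistical software: the function names and toy data are hypothetical, only one explanatory variable is supported, and the p-values passed to `choose_model` are assumed to come from elsewhere.

```python
def fixed_effects_f_test(panel):
    """F-test of Pooled OLS against FEM for one regressor.

    `panel` maps each firm to a list of (x, y) pairs.
    Returns (F statistic, numerator df, denominator df).
    """
    def rss_simple_ols(pairs):
        # Residual sum of squares from regressing y on x with an intercept.
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        syy = sum((y - mean_y) ** 2 for _, y in pairs)
        return syy - sxy ** 2 / sxx

    pooled = [pair for pairs in panel.values() for pair in pairs]
    rss_pooled = rss_simple_ols(pooled)

    # Within transformation: subtracting each firm's own means removes
    # any firm-specific intercept before the regression is run.
    demeaned = []
    for pairs in panel.values():
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        demeaned += [(x - mx, y - my) for x, y in pairs]
    rss_fem = rss_simple_ols(demeaned)

    n_firms, n_obs, k = len(panel), len(pooled), 1
    df1, df2 = n_firms - 1, n_obs - n_firms - k
    f_stat = ((rss_pooled - rss_fem) / df1) / (rss_fem / df2)
    return f_stat, df1, df2


def choose_model(chow_p, hausman_p, lm_p, alpha=0.05):
    """One common decision flow; each argument is that test's p-value."""
    if chow_p < alpha:  # firm effects matter, so Pooled OLS is rejected
        return "FEM" if hausman_p < alpha else "REM"
    return "REM" if lm_p < alpha else "Pooled OLS"


# Toy panel: three firms share a slope of 2 but have different intercepts.
noise = [0.1, -0.1, -0.1, 0.1]
toy = {name: [(x, a + 2 * x + e) for x, e in zip([1, 2, 3, 4], noise)]
       for name, a in {"AAAA": 0, "BBBB": 10, "CCCC": 20}.items()}

f_stat, df1, df2 = fixed_effects_f_test(toy)
print(round(f_stat), df1, df2)
print(choose_model(chow_p=0.001, hausman_p=0.01, lm_p=0.001))  # FEM
```

A large F statistic means the firm-specific intercepts explain much of what Pooled OLS leaves in the residuals. Converting it to a p-value requires the F distribution with (df1, df2) degrees of freedom, which statistical packages report directly.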

Tags: Panel Data, Thesis, Research Methodology, Purposive Sampling