I am working on my MSc dissertation, which evaluates the long-term cost-effectiveness of semaglutide versus insulin glargine for treating type 2 diabetes in the UK. Because I do not have access to real-world patient-level data, I have constructed a synthetic dataset that mimics the characteristics and longitudinal outcomes of a real-world cohort. Its variables include PatientID, Month, Group (treatment assignment), baseline age, sex, IMD quintile, HbA1c, BMI, QALY, cost, major adverse events, and discontinuation.
I would be grateful for help drafting a robust, academically sound methodology section tailored to the use of simulated (virtual) data. Ideally, I am looking for guidance or a sample write-up that demonstrates how to describe:
- The construction of a synthetic panel dataset (e.g., 400 patients followed monthly for 36 months, split between treatment and control groups)
- The rationale and justification for simulating clinical and economic variables (HbA1c, BMI, QALY, cost, event rates, etc.) based on published parameters
- The specific statistical/econometric methods to be used (panel regression, OLS, random/fixed effects, time interactions, sensitivity analysis, etc.)
- Any limitations and considerations unique to working with simulated data
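
For reference on the first item, a balanced panel of the shape described (400 patients, 36 monthly rows each, split evenly between the two arms) could be generated along the following lines. This is only a minimal sketch using the Python standard library, not the construction actually used for the dataset. Every numeric value in it is an arbitrary placeholder rather than a published estimate, and all column names other than PatientID, Month, and Group are hypothetical.

```python
import random

# ARBITRARY PLACEHOLDER parameters for illustration only; replace every value
# with an estimate taken from published sources before any real use.
PLACEHOLDER = {
    "semaglutide": {"hba1c_slope": -0.03, "bmi_slope": -0.08, "monthly_cost": 80.0},
    "glargine": {"hba1c_slope": -0.02, "bmi_slope": 0.02, "monthly_cost": 30.0},
}


def simulate_panel(n_patients=400, n_months=36, seed=2024):
    """Return a balanced panel as a list of dicts, one row per patient-month."""
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    rows = []
    for pid in range(1, n_patients + 1):
        # Alternate assignment to obtain an exact 1:1 split between arms.
        group = "semaglutide" if pid % 2 == 0 else "glargine"
        params = PLACEHOLDER[group]

        # Time-invariant baseline characteristics (placeholder distributions).
        age = round(rng.gauss(58, 10))
        sex = rng.choice(["F", "M"])
        imd = rng.randint(1, 5)       # IMD quintile, 1 to 5
        hba1c0 = rng.gauss(8.5, 1.0)  # baseline HbA1c (%)
        bmi0 = rng.gauss(32.0, 4.0)   # baseline BMI (kg/m^2)

        discontinued = 0
        for month in range(1, n_months + 1):
            # Discontinuation is absorbing: once set, it stays set.
            if discontinued == 0 and rng.random() < 0.01:
                discontinued = 1
            mae = 1 if rng.random() < 0.005 else 0  # major adverse event flag

            # Linear trend plus noise. Simplification: the trend continues
            # after discontinuation, which a real model would need to revisit.
            hba1c = hba1c0 + params["hba1c_slope"] * month + rng.gauss(0, 0.2)
            bmi = bmi0 + params["bmi_slope"] * month + rng.gauss(0, 0.3)

            # Monthly utility bounded to [0, 1]; monthly QALY is utility / 12.
            utility = min(1.0, max(0.0, 0.80 - 0.02 * (hba1c - 7.0) - 0.10 * mae))
            cost = 0.0 if discontinued else params["monthly_cost"]

            rows.append({
                "PatientID": pid, "Month": month, "Group": group,
                "Age": age, "Sex": sex, "IMD": imd,
                "HbA1c": hba1c, "BMI": bmi, "QALY": utility / 12,
                "Cost": cost, "MAE": mae, "Discontinued": discontinued,
            })
    return rows


panel = simulate_panel()
print(len(panel))  # 400 patients x 36 months = 14400 rows
```

Drawing the baseline values once per patient and the noise once per month is what gives the data its panel structure (between-patient and within-patient variation), which is the feature that random- and fixed-effects models rely on.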
If you have any example texts or templates, or are able to review a draft of my methodology section, I would be extremely grateful. Your input would greatly enhance the clarity, rigour, and academic value of my dissertation.
Please let me know if you would be willing to help, or if you can point me to useful references or resources.