# Copulas: Summer Course Project

## Objective

The main objective is to learn how to apply copulas to estimate joint distributions. You will also learn the related topics of generating univariate random numbers and correlated random variables, and of creating scenarios and using them in Monte Carlo simulations of a portfolio's value.

## Code Example

Example Python code is provided for your convenience in the file “project-copula-template.html”.
The example covers the main steps of the copula workflow, but the key pieces of code are omitted (marked “???” in the code snippets). You will have to work them out yourself based on the instructions given during the lecture. The code is presented as a Jupyter-style Python notebook.

## What to submit

You must produce a write-up of at most one page. Figures and tables should go in an appendix and do not count toward the page limit. Accompany your report with your code or any other supporting materials. You can use either Python or Excel for the required analysis; for Python, either a Jupyter notebook or plain Python code is acceptable. Ensure that all required packages appear in the import statements.

## Description

You have a portfolio of four assets:
• SPY, 120 shares
• VXX, 70 shares
• USO, 60 shares
• GLD, 20 shares

These are ETFs that retail investors often use to gain exposure to major market factors and segments: equity prices, equity volatility, oil and gold.

You are provided with a one-year time series of these tickers’ prices, sampled daily; use the adjusted closing prices. You must fit a Gaussian copula with Student t marginals to the log returns of the price series, and forecast the distribution of the portfolio P&L for the next day after March 11, 2020. You must assess the quality of your forecast using the tools that we will cover in this course.

## Step-by-step instructions and grading rubric

### Load data and construct the data set, 1 point

The data is provided in four CSV files downloaded from Yahoo Finance. Load the files, join the series by date and use the adjusted closing price column. You are modeling log returns $r_t = \ln(p_t / p_{t-1})$, where $t$ is the observation day and $p_t$ is the adjusted closing price. Chart the series to make sure the data was loaded properly, e.g. as follows:

Produce descriptive statistics such as the min, max, median and the first four moments (mean, standard deviation, skewness and kurtosis). When reporting kurtosis, clearly indicate whether it is excess kurtosis or not.

Python code example is provided.
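As a rough sketch of this step, independent of the course template, the loading, log-return and descriptive-statistics logic could look as follows. The function names and the mapping of tickers to file paths are hypothetical; the `Date` and `Adj Close` column names assume the usual Yahoo Finance CSV layout. The demo at the bottom runs on a tiny synthetic price table rather than the project data.

```python
import numpy as np
import pandas as pd

def load_adjusted_close(paths):
    """Join the 'Adj Close' column of several CSV files by date.

    `paths` maps ticker -> CSV file path (hypothetical names, supply your own).
    """
    series = {
        ticker: pd.read_csv(path, index_col="Date", parse_dates=True)["Adj Close"]
        for ticker, path in paths.items()
    }
    # The inner join keeps only the dates present in every file.
    return pd.concat(series, axis=1, join="inner").sort_index()

def log_returns(prices):
    """r_t = ln(p_t / p_{t-1}); the first row has no predecessor and is dropped."""
    return np.log(prices / prices.shift(1)).dropna()

def describe_returns(returns):
    """Min, max, median and the first four moments of each column."""
    return pd.DataFrame({
        "min": returns.min(),
        "max": returns.max(),
        "median": returns.median(),
        "mean": returns.mean(),
        "std": returns.std(),
        "skewness": returns.skew(),
        # pandas reports EXCESS kurtosis (about 0 for normal data).
        "excess_kurtosis": returns.kurt(),
    })

# Synthetic prices for illustration only (not the project data).
prices = pd.DataFrame(
    {
        "SPY": [100.0, 101.0, 99.0, 100.0, 102.0, 101.0],
        "GLD": [150.0, 150.0, 153.0, 151.5, 152.0, 154.0],
    },
    index=pd.date_range("2020-03-02", periods=6, freq="D"),
)
returns = log_returns(prices)
summary = describe_returns(returns)
print(summary)
```

With real files you would call `load_adjusted_close` first and pass its result to `log_returns`; `returns.plot(subplots=True)` is one quick way to chart the series.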

### Estimate Student t marginals, 1 point

Fit a Student t distribution to each factor, using its location-scale variant. This distribution has three parameters: degrees of freedom, location and scale. Save the fitted parameters for use in later steps.

Python code example is provided.
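For orientation, here is a minimal sketch of a maximum-likelihood fit of the location-scale Student t using `scipy.stats.t.fit`, which returns the parameters in the order (degrees of freedom, location, scale). The sample is synthetic and the dictionary key `"FACTOR"` is a hypothetical placeholder; with the project data you would loop over the four return series.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for one factor's daily log returns (not real data):
# Student t with 4 degrees of freedom, location 0.001 and scale 0.01.
rng = np.random.default_rng(42)
sample = 0.001 + 0.01 * rng.standard_t(df=4, size=5000)

# Maximum-likelihood fit; scipy returns (df, loc, scale).
df_hat, loc_hat, scale_hat = stats.t.fit(sample)

# Keep the fitted parameters per factor for the later steps.
marginals = {"FACTOR": (df_hat, loc_hat, scale_hat)}
print(marginals)
```

Storing the parameters as a tuple in this order is a design choice made here so that they can be passed positionally to `stats.t.cdf` and `stats.t.ppf` later on.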

### Assess goodness-of-fit of marginals, 2 points

You may assess the fit visually by producing a histogram such as the following:

Pay attention to the fitted degrees of freedom and comment on the value: is it too low? What are the implications?

You can also use statistical goodness-of-fit metrics such as the $\chi^2$ test.
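One possible way to set up such a test, shown here as a sketch rather than the method from the lecture: map the data through the fitted CDF, count how many points fall into equal-width bins on [0, 1] (which are equiprobable under the fitted distribution), and compare the counts with the uniform expectation. The function name and the number of bins are assumptions; `ddof=3` reduces the degrees of freedom for the three fitted parameters.

```python
import numpy as np
from scipy import stats

def chi2_gof_student_t(x, n_bins=20):
    """Chi-square goodness-of-fit test of a fitted location-scale Student t."""
    df, loc, scale = stats.t.fit(x)
    # Under a good fit the transformed data are roughly uniform on [0, 1].
    u = stats.t.cdf(x, df, loc=loc, scale=scale)
    observed, _ = np.histogram(u, bins=n_bins, range=(0.0, 1.0))
    # Expected counts default to equal counts per bin; the test then uses
    # n_bins - 1 - 3 degrees of freedom.
    statistic, p_value = stats.chisquare(observed, ddof=3)
    return statistic, p_value, observed

# Synthetic sample for illustration only (not the project data).
rng = np.random.default_rng(0)
x = 0.01 * rng.standard_t(df=5, size=2000)
statistic, p_value, observed = chi2_gof_student_t(x)
print(statistic, p_value)
```

With roughly one year of daily data the sample is much smaller than in this demo, so you would likely want fewer bins to keep the expected count per bin reasonably large.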
Partial Python code example is provided.

### Calibrate Gaussian Copula, 10 points

Follow the workflow discussed in class for the canonical maximum likelihood approach. Here is the outline. Calibrating a Gaussian copula boils down to deducing its correlation matrix parameter from the rank correlation matrix of the factors:
• Obtain the Spearman correlation matrix of the input data
• Deduce the copula correlation matrix parameter $\hat{R}$ by applying the transformation $2\sin\left(\frac{\pi}{6}\rho\right)$ to the off-diagonal elements $\rho$ of the Spearman correlation matrix

Partial Python code example is provided.
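The two bullet points above can be sketched as follows. This is an illustration under stated assumptions, not the template code: the function name is hypothetical, the Spearman matrix is computed as the Pearson correlation of column-wise ranks, and the demo data are synthetic bivariate normal draws with a known correlation of 0.6.

```python
import numpy as np
from scipy import stats

def gaussian_copula_corr(data):
    """Copula correlation matrix from data of shape (n_obs, n_factors)."""
    # Spearman correlation = Pearson correlation of the ranks.
    ranks = stats.rankdata(data, axis=0)
    rho_s = np.corrcoef(ranks, rowvar=False)
    # Map rank correlations to Gaussian copula correlations.
    R = 2.0 * np.sin(np.pi * rho_s / 6.0)
    # The diagonal maps to 1 anyway; set it exactly to avoid rounding noise.
    np.fill_diagonal(R, 1.0)
    return R

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
true_R = np.array([[1.0, 0.6], [0.6, 1.0]])
data = rng.multivariate_normal(np.zeros(2), true_R, size=20000)

R_hat = gaussian_copula_corr(data)
min_eig = np.linalg.eigvalsh(R_hat).min()
print(R_hat, min_eig)
```

Because the transformation is applied element by element, the result is not guaranteed to be positive definite in general, so checking the smallest eigenvalue before taking a Cholesky decomposition is a sensible precaution.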

### Simulate factor scenarios, 3 points

Follow the workflow discussed in class for the simulation step. Here is the outline. Simulation with a Gaussian copula boils down to generating correlated Gaussian random variables, then converting them to the marginal distributions by applying the inverse CDF method:
• Generate correlated standard normal random variables (Xcs in the code example) using the Cholesky decomposition $L$ of the correlation matrix $\hat{R}$
• Convert the standard normal variables into uniformly distributed variables (Usim in the code example) by applying the CDF of the standard normal distribution, $F(X)$
• Convert the uniform numbers into marginal Student t distributed variables (Xsim in the code example) by applying the inverse CDF of the fitted marginals, $F_t^{-1}(U)$

The simulated variables that you produced comprise the factor scenarios.

Partial Python code example is provided.
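A compact sketch of the three simulation steps is given below. The names `Xcs`, `Usim` and `Xsim` follow the code example mentioned above; the function name, the correlation matrix and the marginal parameters are hypothetical placeholders rather than fitted values.

```python
import numpy as np
from scipy import stats

def simulate_scenarios(R, marginals, n_sims, seed=0):
    """Gaussian copula simulation with Student t marginals.

    R         : (d, d) copula correlation matrix
    marginals : list of d tuples (df, loc, scale)
    """
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(R)                 # R = L @ L.T
    Z = rng.standard_normal((n_sims, R.shape[0]))
    Xcs = Z @ L.T                             # correlated standard normals
    Usim = stats.norm.cdf(Xcs)                # uniforms on (0, 1)
    Xsim = np.column_stack([                  # inverse CDF of each marginal
        stats.t.ppf(Usim[:, j], df, loc=loc, scale=scale)
        for j, (df, loc, scale) in enumerate(marginals)
    ])
    return Xcs, Usim, Xsim

# Hypothetical inputs for illustration only.
R = np.array([[1.0, -0.7], [-0.7, 1.0]])
marginals = [(4.0, 0.0005, 0.01), (3.0, -0.001, 0.03)]
Xcs, Usim, Xsim = simulate_scenarios(R, marginals, n_sims=50_000)
print(np.corrcoef(Xcs, rowvar=False))
```

Multiplying the independent draws by `L.T` on the right gives each row the covariance `L @ L.T`, which equals `R`.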

### Assess goodness-of-fit of Copula, 4 points

The assessment can be visual: e.g. produce historical scatter plots of VXX vs SPY, as well as their uniform versions obtained by applying the CDFs of the marginals, and compare them with the simulated scenarios as shown in the figure below.

The assessment can be made more objective and quantitative by comparing descriptive statistics of the historical and simulated scenarios.

Finally, the assessment can be statistical, using goodness-of-fit tests.

Partial Python code example is provided.
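As one illustration of a quantitative comparison, the sketch below reports the largest gap between the historical and simulated Spearman correlation matrices, together with a two-sample Kolmogorov-Smirnov p-value for each factor. The choice of these two diagnostics and the function names are assumptions made here; the KS test only checks each marginal, not the dependence structure. Both samples in the demo are synthetic.

```python
import numpy as np
from scipy import stats

def spearman_matrix(x):
    """Spearman correlation matrix of an (n_obs, n_factors) array."""
    ranks = stats.rankdata(x, axis=0)
    return np.corrcoef(ranks, rowvar=False)

def compare_hist_vs_sim(hist, sim):
    """Largest rank-correlation gap and per-factor KS p-values."""
    corr_gap = np.max(np.abs(spearman_matrix(hist) - spearman_matrix(sim)))
    ks_pvalues = [
        stats.ks_2samp(hist[:, j], sim[:, j]).pvalue
        for j in range(hist.shape[1])
    ]
    return corr_gap, ks_pvalues

# Synthetic stand-ins for the historical and simulated factor returns.
cov = np.array([[1.0, -0.5], [-0.5, 1.0]])
hist = np.random.default_rng(10).multivariate_normal(np.zeros(2), cov, size=1000)
sim = np.random.default_rng(11).multivariate_normal(np.zeros(2), cov, size=20000)

corr_gap, ks_pvalues = compare_hist_vs_sim(hist, sim)
print(corr_gap, ks_pvalues)
```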

### Forecast P&L distribution of a portfolio for next day, 2 points

Using the generated scenarios, produce a forecast of asset prices for the next day and evaluate the portfolio value in each scenario. For instance, if the asset return in scenario $i$ is $r_i$, then its price forecast is

$$p_1 = p_0 \times e^{r_i}.$$

Knowing the holdings of each asset and the initial prices, it is trivial to evaluate the market value of the portfolio in each scenario, $V_i$, and the P&L $\Delta V_i = V_i - V_0$. Once you have run multiple scenarios, you have a set of P&Ls $\Delta V_i$ to which you can fit the P&L distribution.

For this project it is sufficient to report the 99th percentile loss, i.e. the loss amount $\mathrm{VaR}_\alpha$ such that there is an $\alpha = 0.01$ (1%) probability of losing more than $\mathrm{VaR}_\alpha$ over one day. However, you may also attempt to fit a parametric distribution to the P&Ls and estimate $\mathrm{VaR}_\alpha$ from it.
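The revaluation and the empirical 99th percentile loss can be sketched as follows, independently of the course template. The holdings are those listed in the project description; the initial prices and the simulated returns are hypothetical placeholders (independent normal draws, not copula output), and the function names are illustrative.

```python
import numpy as np

# Holdings from the project description, in the order SPY, VXX, USO, GLD.
holdings = np.array([120.0, 70.0, 60.0, 20.0])

# HYPOTHETICAL initial prices p0, for illustration only; use the last
# adjusted closing prices from your data set instead.
prices0 = np.array([100.0, 50.0, 10.0, 150.0])

def pnl_scenarios(returns_sim, prices0, holdings):
    """P&L per scenario for log returns of shape (n_sims, n_assets)."""
    prices1 = prices0 * np.exp(returns_sim)   # p1 = p0 * exp(r_i)
    v0 = prices0 @ holdings                   # current portfolio value V_0
    v1 = prices1 @ holdings                   # scenario values V_i
    return v1 - v0                            # Delta V_i

def value_at_risk(pnl, alpha=0.01):
    """Loss exceeded with probability alpha, reported as a positive number."""
    return -np.quantile(pnl, alpha)

# Placeholder scenarios: independent normal returns, NOT copula output.
rng = np.random.default_rng(0)
returns_sim = 0.02 * rng.standard_normal((100_000, 4))

pnl = pnl_scenarios(returns_sim, prices0, holdings)
var_99 = value_at_risk(pnl, alpha=0.01)
print(var_99)
```

Reporting the loss as a positive number is a convention chosen here; state your own sign convention in the write-up.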
Partial Python code example is provided.

### Assess convergence of P&L distribution forecast, 2 points

Assess the quality of your P&L distribution forecast, in particular of the $\mathrm{VaR}_\alpha$ value that you obtained.
At the very least, you may comment on the convergence of $\mathrm{VaR}_\alpha$ as the number of simulations increases.
However, a more sophisticated analysis would also consider the quality of the marginal distribution and copula fits.
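One simple way to study convergence, sketched here under stated assumptions, is to repeat the simulation several times for each sample size and look at how the spread of the resulting estimates shrinks. The function `sample_pnl` is a hypothetical stand-in for your full copula pipeline; it draws heavy-tailed P&Ls directly so that the example is self-contained.

```python
import numpy as np

def sample_pnl(n, rng):
    """Placeholder for the copula pipeline: n heavy-tailed P&L draws."""
    return 1000.0 * rng.standard_t(df=4, size=n)

def var_convergence(sizes, n_repeats=20, alpha=0.01, seed=0):
    """Mean and standard deviation of the VaR estimate per sample size."""
    rng = np.random.default_rng(seed)
    rows = []
    for n in sizes:
        estimates = [
            -np.quantile(sample_pnl(n, rng), alpha) for _ in range(n_repeats)
        ]
        rows.append((n, np.mean(estimates), np.std(estimates, ddof=1)))
    return rows

table = var_convergence([1_000, 10_000, 100_000])
for n, mean_var, std_var in table:
    print(f"n={n:>7d}  VaR mean={mean_var:10.1f}  VaR std={std_var:8.1f}")
```

The standard deviation across repeats gives a rough Monte Carlo error for the estimate; plotting it against the number of simulations is one way to support a statement about convergence in the write-up.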