{ "cells": [ { "cell_type": "markdown", "id": "9d749fa9", "metadata": {}, "source": [ "*******************************************************************************\n", "** Please acknowledge the source\n", "** Boermans, M.A. (2022). A literature review of Securities Holdings Statistics research and a practitioner's guide. DNB Working Paper No 757.\n", "** Version 2.0 (XX August 2024) (First version: 8 December 2022)\n", "*******************************************************************************" ] }, { "cell_type": "code", "execution_count": null, "id": "2c5ffe55", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "# Check the file extension, or select F31_2024Q1, F511_2024Q1 or F52E_2024Q1 instead\n", "file = \"F32_2024Q1.dta\"\n", "df = pd.read_stata(file)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "bba48748", "metadata": {}, "outputs": [], "source": [ "# Alternative input files:\n", "#   F32_2024Q1.dta\n", "#   F31_2024Q1.dta\n", "# Stocks and investment funds (edit the quotation_basis filter, ad 6.):\n", "#   F511_2024Q1.dta\n", "#   F52E_2024Q1.dta" ] }, { "cell_type": "markdown", "id": "12281681-d784-4b95-8692-2ef74d16def5", "metadata": {}, "source": [ "##### Data preparation and selection\n", "\n", "Selecting:\n", "\n" ] }, { "cell_type": "code", "execution_count": 147, "id": "b9dec650-e702-489a-8b03-7c6267397756", "metadata": {}, "outputs": [], "source": [ "df = df[df['amount_type']=='LE']\n", "\n", "df = df[df['valuation']=='M']\n", "\n", "df = df[(df['holder_area']=='AT') | (df['holder_area']=='BE') | (df['holder_area']=='CY') | (df['holder_area']=='DE') | (df['holder_area']=='EE') | (df['holder_area']=='ES') | (df['holder_area']=='FI') | (df['holder_area']=='FR') | (df['holder_area']=='GR') | (df['holder_area']=='HR') | (df['holder_area']=='IE') | (df['holder_area']=='IT') | (df['holder_area']=='LT') | (df['holder_area']=='LU') | 
(df['holder_area']=='LV') | (df['holder_area']=='MT') | (df['holder_area']=='NL') | (df['holder_area']=='PT') | (df['holder_area']=='SI') | (df['holder_area']=='SK')]\n", "\n", "# Drop rows with tph == 'N' for holder sectors S_14, S_15 and S_1MU\n", "df = df[~((df['tph']=='N') & df['holder_sector'].isin(['S_14', 'S_15', 'S_1MU']))]" ] }, { "cell_type": "markdown", "id": "ca38be04-0ceb-43d0-8bab-c45a9d042404", "metadata": {}, "source": [ "##### Data cleaning\n", "Dropping:\n", "" ] }, { "cell_type": "code", "execution_count": 148, "id": "f682e3af-7b0e-44ba-b443-b2a83288cc10", "metadata": {}, "outputs": [], "source": [ "df = df[~(df['functional_category']=='D')]\n", "\n", "df = df[df['obs_value']>0]\n", "\n", "df = df[df['security_status']==100]\n", "\n", "# Drop securities whose issuer country is in the list below\n", "df = df[~df['issuer_country'].isin(['VI', 'CW', 'KY', 'BS', 'BM', 'VG', 'IM', 'MH', 'GG', 'GI', 'JE', 'LI'])]\n", "\n", "df = df[~(df['amount_out_eur']<100000000)]\n", "\n", "df = df[~(df['debt_type']=='D.18')]\n", "\n", "df = df[~(df['quotation_basis']=='PCL')]" ] }, { "cell_type": "markdown", "id": "839c005a-2e88-43d3-96d0-fb04d0b19ef7", "metadata": {}, "source": [ "##### New holder sector definition:" ] }, { "cell_type": "code", "execution_count": 149, "id": "5bb77867-ce55-47f3-97f9-aaab447a1ebd", "metadata": {}, "outputs": [], "source": [ "df.loc[(df['holder_sector']=='S_122'), 'holder_sector_new'] = 'banks'\n", "df.loc[(df['holder_sector']=='S_128'), 'holder_sector_new'] = 'insur'\n", "df.loc[(df['holder_sector']=='S_129'), 'holder_sector_new'] = 'pfund'\n", "df.loc[(df['holder_sector']=='S_123') | (df['holder_sector']=='S_124'), 'holder_sector_new'] = 'invfd'\n", "df.loc[(df['holder_sector']=='S_14') | (df['holder_sector']=='S_15') | (df['holder_sector']=='S_1MU'), 
'holder_sector_new'] = 'hhold'\n", "df.loc[(df['holder_sector']=='S_125A') | (df['holder_sector']=='S_125W'), 'holder_sector_new'] = 'omfis'\n", "df.loc[(df['holder_sector']=='S_1311') | (df['holder_sector']=='S_1312') | (df['holder_sector']=='S_1313') | (df['holder_sector']=='S_1314') | (df['holder_sector']=='S_13U'), 'holder_sector_new'] = 'gov'\n", "df.loc[(df['holder_sector']=='S_11'), 'holder_sector_new'] = 'nonfc'" ] }, { "cell_type": "markdown", "id": "fd65b6ec-c315-47f5-a25c-f48eea3f4257", "metadata": {}, "source": [ "Deleting the old holder sector variable, and the rows for which the new holder sector variable is unknown:" ] }, { "cell_type": "code", "execution_count": null, "id": "62445ac3", "metadata": {}, "outputs": [], "source": [ "# Unmapped sectors are missing values (NaN), not the string 'nan'\n", "df = df[df['holder_sector_new'].notna()]\n", "del df['holder_sector']" ] }, { "cell_type": "markdown", "id": "5b467ba9-5631-4695-b20f-d1f7cfdfb829", "metadata": {}, "source": [ "This next cell is not really necessary, as grouping over a string variable is possible in Python" ] }, { "cell_type": "code", "execution_count": 151, "id": "d8930365-9d9e-48a9-aa78-cbe040ed4195", "metadata": {}, "outputs": [], "source": [ "# df['holder_sector'] = df['holder_sector_new']\n", "# labels, levels = pd.factorize(df['holder_sector'])\n", "# df['holder_sector'] = labels\n", "# df['holder_sector'].head()\n", "# This changes holder_sector to numbers instead of to a categorical version (letters, but not a string)" ] }, { "cell_type": "markdown", "id": "1b65fffe-6a03-41b5-a869-765d2a7f9854", "metadata": {}, "source": [ "##### Data aggregation: holder country - holder sector level aggregation" ] }, { "cell_type": "markdown", "id": "7945958d-8fed-496c-b854-0347b1a2ec6f", "metadata": {}, "source": [ "Create an id-number for every new combination of the included variables.\\\n", "```ngroup()``` numbers each group. 
The numbers match the order in which the groups would be seen when iterating over the groupby object, not the order in which they are first observed." ] }, { "cell_type": "code", "execution_count": 152, "id": "f09739dc-4fb3-4590-9711-79754e110f8d", "metadata": {}, "outputs": [], "source": [ "df['id1'] = df.groupby(['identifier', 'holder_area', 'holder_sector_new', 'period']).ngroup()" ] }, { "cell_type": "markdown", "id": "90452798-a3a4-421d-9cbf-1556c93894f2", "metadata": {}, "source": [ "For every id1 group, create a new variable 'temp' that holds the sum of obs_value within that group" ] }, { "cell_type": "code", "execution_count": 153, "id": "a34bcf85-6cf4-46c9-9ac7-6e15f0295a67", "metadata": {}, "outputs": [], "source": [ "df['temp'] = df.groupby(['id1'])['obs_value'].transform('sum')" ] }, { "cell_type": "markdown", "id": "f5f5db15-bf4b-4016-9a69-028c2e24aa3c", "metadata": {}, "source": [ "Replace obs_value with temp (for an id1 group with a single row, obs_value does not change)" ] }, { "cell_type": "code", "execution_count": 154, "id": "ac1609aa-ac87-4f9a-9812-be530596c612", "metadata": {}, "outputs": [], "source": [ "df['obs_value'] = df['temp']" ] }, { "cell_type": "markdown", "id": "55238a20-0cb4-40ee-94bd-8f0faa47ae9f", "metadata": {}, "source": [ "Within every id1 group, create a running counter 'tempvar' that starts at 1 and increases by 1 for each additional row in the group" ] }, { "cell_type": "code", "execution_count": 155, "id": "ec4c2cf3-2c12-42c6-9ad3-cb24fe6f6121", "metadata": {}, "outputs": [], "source": [ "df['tempvar'] = df.groupby('id1').cumcount() + 1" ] }, { "cell_type": "markdown", "id": "6411cbf8-6382-4df9-b273-d83d5ecc753d", "metadata": {}, "source": [ "Drop the rows for which tempvar is greater than 1: this ensures that exactly one row is kept per id1" ] }, { "cell_type": "code", "execution_count": 156, "id": "25577dc0-0c73-4d5c-a6fb-68ec17582f65", "metadata": {}, "outputs": [], "source": [ "df = df[~(df['tempvar']>1)]" ] }, { "cell_type": 
"markdown", "id": "d6baea40-27b2-4b98-9fdb-28be791272db", "metadata": {}, "source": [ "Drop the variables of this aggregation stage" ] }, { "cell_type": "code", "execution_count": 157, "id": "b940a973-dd64-41e5-bd2b-54c8a42bd50a", "metadata": {}, "outputs": [], "source": [ "df = df.drop(['temp', 'tempvar', 'id1', 'functional_category'], axis=1)" ] }, { "cell_type": "code", "execution_count": 158, "id": "4b01f6c7-15fd-40e2-be21-8a8769e074b1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(32868, 87)" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "markdown", "id": "031e2466-9534-4548-8f67-50d9f9ea7556", "metadata": {}, "source": [ "##### Final cleaning after data aggregations\n", "\n", "Specific to bond and money markets; for F511 and F52, use market_cap_eur instead of amount_out_eur" ] }, { "cell_type": "code", "execution_count": 159, "id": "762a3bee-4b2e-466f-bbce-d5adbc0c7a89", "metadata": {}, "outputs": [], "source": [ "df.loc[((df['obs_value'] > df['amount_out_eur']) & (~df['obs_value'].isna()) & (~df['amount_out_eur'].isna())), 'obs_value'] = df['amount_out_eur']\n", "\n", "df = df[df['obs_value'] > 10000]" ] }, { "cell_type": "markdown", "id": "052efb00-25ca-433e-a9e8-103543fe9dc2", "metadata": {}, "source": [ "Drop irrelevant variables:" ] }, { "cell_type": "code", "execution_count": 160, "id": "ee6b079f-47ce-4b72-85de-898130fd27c9", "metadata": {}, "outputs": [], "source": [ "df = df.drop(['identifier_type', 'instr_class', 'dividend_amount', 'dividend_amount_type', 'dividend_currency', 'dividend_settl_date', 'last_split_factor', 'last_split_date', 'market_cap_eur', 'market_cap', 'conf_status', 'conf_status_calc', 'sensitivity', 'escb', 'accr_interest', 'accr_income_factor', 'issue_price', 'last_coupon_date', 'last_coupon_freq', 'poolfactor', 'orig_mat_days', 'orig_mat_bracket', 'unit_measure'], axis=1)\n", "placeholder_drop = df.columns[df.columns.str.startswith('placeholder')]\n", 
"df = df.drop(columns=placeholder_drop)" ] }, { "cell_type": "markdown", "id": "564f5fa9-5836-4123-970a-a9ab52eaa240", "metadata": {}, "source": [ "##### Variable generation" ] }, { "cell_type": "code", "execution_count": 174, "id": "c4c443ec-4b44-421f-a255-f4d158c62ad5", "metadata": {}, "outputs": [], "source": [ "df['hold_ln'] = np.log(df['obs_value'])" ] }, { "cell_type": "markdown", "id": "678de4a9-747d-4db8-9aef-c6f307d276fa", "metadata": {}, "source": [ "```np.nanpercentile(a, q)```: returns the q-th percentile of a, ignoring missing values \\\n", "```np.clip(a, a_min, a_max)```: a_min and a_max determine the interval; values of a outside the interval are clipped to the interval edges" ] }, { "cell_type": "code", "execution_count": 161, "id": "8b0c0fb6-bec7-4bb6-bbdd-b7f722521426", "metadata": {}, "outputs": [], "source": [ "def winsorize(series, lower_pct=5, upper_pct=95):\n", "    # nanpercentile ignores NaN, so missing prices do not turn the bounds into NaN\n", "    lower_bound = np.nanpercentile(series, lower_pct)\n", "    upper_bound = np.nanpercentile(series, upper_pct)\n", "    return np.clip(series, lower_bound, upper_bound)\n", "\n", "df['price_value_w'] = winsorize(df['price_value'])\n", "\n", "df['mv'] = (df['amount_out_eur'] * df['price_value_w']) / 100\n", "\n", "df['mvsize_ln'] = np.log(df['mv'])\n", "\n", "df['share'] = df['obs_value'] / df['mv']" ] }, { "cell_type": "code", "execution_count": 162, "id": "1a033ba9-1cdb-42fd-aa09-7d4db9f298cf", "metadata": {}, "outputs": [], "source": [ "df['EUR'] = np.nan\n", "df.loc[df['nom_curr']==\"EUR\", 'EUR'] = 1\n", "df.loc[~(df['nom_curr']==\"EUR\"), 'EUR'] = 0" ] }, { "cell_type": "code", "execution_count": 163, "id": "d2213e2d-e079-49ff-b700-54082ca181e6", "metadata": {}, "outputs": [], "source": [ "df['home'] = np.nan\n", "df.loc[df['holder_area']==df['issuer_country'], 'home'] = 1\n", "df.loc[~(df['holder_area']==df['issuer_country']), 'home'] = 0" ] }, { "cell_type": "code", "execution_count": 164, "id": "69f4cde5-51ae-4dc7-8f2f-2af43e473072", "metadata": {}, "outputs": [], "source": [ "df['euroarea'] = 0\n", 
"# Same 20 euro area countries as in the holder_area selection\n", "df.loc[df['issuer_country'].isin(['AT', 'BE', 'CY', 'DE', 'EE', 'ES', 'FI', 'FR', 'GR', 'HR', 'IE', 'IT', 'LT', 'LU', 'LV', 'MT', 'NL', 'PT', 'SI', 'SK']), 'euroarea'] = 1" ] }, { "cell_type": "code", "execution_count": 165, "id": "1b275853-22be-41ad-9f40-0c96b76e183d", "metadata": {}, "outputs": [], "source": [ "df['holder_area_num'] = df.groupby(['holder_area']).ngroup()\n", "df['holder_sector_num'] = df.groupby(['holder_sector_new']).ngroup()\n", "df['issuer_country_num'] = df.groupby(['issuer_country']).ngroup()" ] }, { "cell_type": "markdown", "id": "e85c08a2-d4ec-4336-958e-de3a13fa9188", "metadata": {}, "source": [ "##### Benchmark investor sectors: gov, nonfc" ] }, { "cell_type": "code", "execution_count": 166, "id": "ed424918-fc3d-4b74-923d-39ad32ae5d6f", "metadata": {}, "outputs": [], "source": [ "df['EUR_banks'] = 0\n", "df.loc[(df['holder_sector_new']=='banks') & (df['EUR']==1), 'EUR_banks'] = 1\n", "\n", "df['EUR_hhold'] = 0\n", "df.loc[(df['holder_sector_new']=='hhold') & (df['EUR']==1), 'EUR_hhold'] = 1\n", "\n", "df['EUR_insur'] = 0\n", "df.loc[(df['holder_sector_new']=='insur') & (df['EUR']==1), 'EUR_insur'] = 1\n", "\n", "df['EUR_omfis'] = 0\n", "df.loc[(df['holder_sector_new']=='omfis') & (df['EUR']==1), 'EUR_omfis'] = 1\n", "\n", "# Investment funds are labelled 'invfd' in holder_sector_new\n", "df['EUR_invfund'] = 0\n", "df.loc[(df['holder_sector_new']=='invfd') & (df['EUR']==1), 'EUR_invfund'] = 1\n", "\n", "df['EUR_pfund'] = 0\n", "df.loc[(df['holder_sector_new']=='pfund') & (df['EUR']==1), 'EUR_pfund'] = 1" ] }, { "cell_type": "code", "execution_count": 167, "id": 
"f25704b3-813a-409e-82fe-e83b3ffd697e", "metadata": {}, "outputs": [], "source": [ "df['euroarea_banks'] = 0\n", "df.loc[(df['holder_sector_new']=='banks') & (df['euroarea']==1), 'euroarea_banks'] = 1\n", "\n", "df['euroarea_hhold'] = 0\n", "df.loc[(df['holder_sector_new']=='hhold') & (df['euroarea']==1), 'euroarea_hhold'] = 1\n", "\n", "df['euroarea_insur'] = 0\n", "df.loc[(df['holder_sector_new']=='insur') & (df['euroarea']==1), 'euroarea_insur'] = 1\n", "\n", "df['euroarea_omfis'] = 0\n", "df.loc[(df['holder_sector_new']=='omfis') & (df['euroarea']==1), 'euroarea_omfis'] = 1\n", "\n", "# Investment funds are labelled 'invfd' in holder_sector_new\n", "df['euroarea_invfund'] = 0\n", "df.loc[(df['holder_sector_new']=='invfd') & (df['euroarea']==1), 'euroarea_invfund'] = 1\n", "\n", "df['euroarea_pfund'] = 0\n", "df.loc[(df['holder_sector_new']=='pfund') & (df['euroarea']==1), 'euroarea_pfund'] = 1" ] }, { "cell_type": "code", "execution_count": 168, "id": "688515f8-8383-4043-be0c-2c4db63c6613", "metadata": {}, "outputs": [], "source": [ "df['mvsize_ln_banks'] = 0\n", "df['mvsize_ln_banks'] = df['mvsize_ln_banks'].astype(float)\n", "df.loc[(df['holder_sector_new']=='banks'), 'mvsize_ln_banks'] = df[\"mvsize_ln\"]\n", "\n", "df['mvsize_ln_hhold'] = 0\n", "df['mvsize_ln_hhold'] = df['mvsize_ln_hhold'].astype(float)\n", "df.loc[(df['holder_sector_new']=='hhold'), 'mvsize_ln_hhold'] = df[\"mvsize_ln\"]\n", "\n", "df['mvsize_ln_insur'] = 0\n", "df['mvsize_ln_insur'] = df['mvsize_ln_insur'].astype(float)\n", "df.loc[(df['holder_sector_new']=='insur'), 'mvsize_ln_insur'] = df[\"mvsize_ln\"]\n", "\n", "df['mvsize_ln_omfis'] = 0\n", "df['mvsize_ln_omfis'] = df['mvsize_ln_omfis'].astype(float)\n", "df.loc[(df['holder_sector_new']=='omfis'), 'mvsize_ln_omfis'] = df[\"mvsize_ln\"]\n", "\n", "# Investment funds are labelled 'invfd' in holder_sector_new\n", "df['mvsize_ln_invfund'] = 0\n", "df['mvsize_ln_invfund'] = df['mvsize_ln_invfund'].astype(float)\n", "df.loc[(df['holder_sector_new']=='invfd'), 'mvsize_ln_invfund'] = df[\"mvsize_ln\"]\n", "\n", "df['mvsize_ln_pfund'] = 0\n", "df['mvsize_ln_pfund'] = df['mvsize_ln_pfund'].astype(float)\n", 
"df.loc[(df['holder_sector_new']=='pfund'), 'mvsize_ln_pfund'] = df[\"mvsize_ln\"] " ] }, { "cell_type": "code", "execution_count": 169, "id": "9d6eeb4d-40e5-468b-b1f5-1ff9a7bf39e4", "metadata": {}, "outputs": [], "source": [ "df['id_panel_sj'] = df['identifier'].astype(str) + ' ' + df['holder_area'].astype(str) + ' ' + df['holder_sector_new'].astype(str)\n", "df['id_panel_cluster'] = df['holder_area'].astype(str) + ' ' + df['holder_sector_new'].astype(str)" ] }, { "cell_type": "markdown", "id": "8350d0fa-157f-499f-bd44-f6c5e8dc42b0", "metadata": {}, "source": [ "##### Regression" ] }, { "cell_type": "markdown", "id": "02fd33c7-301d-400e-aa5d-bb8c6fc2505c", "metadata": {}, "source": [ "Creating a qdate variable" ] }, { "cell_type": "code", "execution_count": 179, "id": "288a79a8-4be3-4081-9285-c20af7b4f8fe", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " period year quarter qdate qdate_str\n", "id_panel_sj qdate \n", "US961214EF61 NL invfd 2024-01-01 2024-Q1 2024 Q1 2024-01-01 2024Q1\n", "FR0013398831 DE gov 2024-01-01 2024-Q1 2024 Q1 2024-01-01 2024Q1\n", "PTEGPAOM0017 LU invfd 2024-01-01 2024-Q1 2024 Q1 2024-01-01 2024Q1\n", "XS2675884576 LU hhold 2024-01-01 2024-Q1 2024 Q1 2024-01-01 2024Q1\n", "US05578BAJ52 IT pfund 2024-01-01 2024-Q1 2024 Q1 2024-01-01 2024Q1" ] }, "execution_count": 179, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[['year', 'quarter']] = df['period'].str.split('-', expand=True)\n", "df['qdate'] = pd.to_datetime(df['year'].astype(str) + df['quarter'].astype(str))\n", "df['qdate_str'] = df['qdate'].dt.to_period('Q').astype(str)\n", "df[['period', 'year', 'quarter', 'qdate', 'qdate_str']].head()" ] }, { "cell_type": "code", "execution_count": 171, "id": "6df8e1a4-afd8-463d-b056-937a380f1f8a", "metadata": {}, "outputs": [], "source": [ "df.set_index(['id_panel_sj', 'qdate'], inplace=True)" ] }, { "cell_type": "code", "execution_count": 175, "id": "ad91672d-ce77-4f20-aae9-fb46fe2e42f5", "metadata": {}, "outputs": [], "source": [ "# Lag hold_ln within each panel unit (id_panel_sj), ordered by qdate\n", "df['hold_ln_l'] = df.sort_index().groupby(level='id_panel_sj')['hold_ln'].shift()" ] }, { "cell_type": "markdown", "id": "8bcb303d-1842-43be-8d13-11e6a00e3958", "metadata": {}, "source": [ "Drop rows with missing values in the variables used in the regression. The subset is listed explicitly: `dropna()` without `subset` drops every row that has a missing value in any column, which can leave no observations." ] }, { "cell_type": "code", "execution_count": 176, "id": "47108a64-9f27-49d2-82cf-26893473c509", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(30164, 88)" ] }, "execution_count": 176, "metadata": {}, "output_type": "execute_result" } ], "source": [ "columns_used = ['hold_ln', 'EUR', 'mvsize_ln', 'holder_area_num', 'holder_sector_num', 'issuer_country_num', 'id_panel_cluster']\n", "df_clean = df.dropna(subset=columns_used)\n", "df_clean.shape" ] }, { "cell_type": "code", "execution_count": null, "id": "aa1e2634", "metadata": {}, 
"outputs": [], "source": [ "import statsmodels.api as sm\n", "import statsmodels.formula.api as smf\n", "\n", "# C(a)*C(b) expands to both main effects plus their interaction; both terms are categorical\n", "formula = 'hold_ln ~ EUR + mvsize_ln + C(holder_area_num)*C(holder_sector_num) + C(issuer_country_num)'\n", "\n", "model = smf.ols(\n", " formula=formula, \n", " data=df_clean).fit(cov_type='cluster', cov_kwds={'groups': df_clean['id_panel_cluster']})\n", "\n", "print(model.summary())" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 5 }