In the first lecture, we introduced the metaphor of explaining investment returns with factors: the return on an asset is made up of the returns on various factors, just as food is made up of nutrients. Our previous classes covered how to analyze precisely how much each factor contributes to the investment return. But you can only earn returns from factors when you take on the investment risk that comes with them. You can do factor analysis with the Fama-French five-factor model, but you can also use other factors, such as macroeconomic factors, which are very commonly used. We usually classify factor models into three categories: fundamental factor models such as the Fama-French model, macro factor models, and statistical factor models. Today, we'll discuss the macro factor model and the common problems you will face when dealing with financial data. You are going to use the macro factor model in your project.
There are many references on macro factor models, but I'll refer to the model described on BlackRock's website. BlackRock is one of the well-known investment companies in factor investing. According to many studies, a significant portion of asset returns can be explained by just a few macroeconomic factors. Economic growth, the real rate, and inflation are considered the three most important. Credit and liquidity are also critical factors. Furthermore, the overall economic status of emerging markets is now considered an important factor as well. In particular, the factors other than economic growth, the real rate, and inflation become much more descriptive when the overall economic or financial situation is under stress. When we describe asset returns using macroeconomic factors, the returns are regarded as a reward for bearing the risk that comes from the uncertainty in these factors. Now, let us go through the factors one by one to understand them better.
The most crucial factor is economic growth. Of course, we should invest when we expect growth. However, growth is also uncertain. Take GDP, for example. GDP is essential, but it does not go directly into our model as a factor. Instead, many investors create a slightly modified factor that contains information about growth. The most commonly used one is the difference between expected growth and actual growth. For example, suppose we expected GDP growth of 2 percent in the first quarter, but the announced figure was 2.2 percent. The difference between the two would be a factor explaining the return on an asset. This difference between the expected and the actual outcome is called a surprise, or macroeconomic surprise.
Real interest rates carry information about current central bank policy. When nominal interest rates rise, the price of a bond paying a nominal interest rate falls. Bond investors, in return, end up bearing the risk of these long-term movements in interest rates. In general, assets that are sensitive to real interest rates see their returns fall when real interest rates rise.
Next is inflation. If we think inflation will be too high, the cash flows that an asset is already set to pay in the future will be less valuable. Investors will therefore not prefer this asset if they expect inflation to rise, and as a result, the return on the asset will fall.
Now, let us talk briefly about the other factors. Credit measures how many corporate bonds will default. Default means that the companies that issued the bonds are unable to pay interest. Then how do we measure investors' expectations of default? If investors think the default probability is high when investing in corporate bonds, they will demand more interest than on a safe bond. Therefore, there is a gap between the interest rates of higher-risk, lower-rated bonds and lower-risk, higher-rated bonds. You can use this gap to explain asset risk.
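The two constructions just described, a growth surprise and a credit gap, are simple differences. Here is a minimal sketch in base R. The 2.0 and 2.2 figures come from the example above; the yields are made-up numbers, and the sign convention (actual minus expected) is an assumption, since the lecture only says "the difference."

```r
# Growth surprise: announced figure minus the prior expectation (in percent).
# Sign convention (actual - expected) is an assumption.
expected_growth <- 2.0
actual_growth   <- 2.2
growth_surprise <- actual_growth - expected_growth

# Credit gap: yield on lower-rated bonds minus yield on higher-rated bonds.
# The yields below are hypothetical values for three periods.
baa_yield <- c(5.1, 5.4, 6.0)
aaa_yield <- c(4.2, 4.3, 4.4)
credit_spread <- baa_yield - aaa_yield

print(round(growth_surprise, 2))
print(round(credit_spread, 2))
```

A wider spread means investors are demanding more compensation for default risk, which is what makes the series usable as a credit factor.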
What does it mean for an asset to have low liquidity? It means that you can't sell it easily when you want to. Because of this risk, investors in less liquid assets demand a higher return. This is the reward for taking liquidity risk. Generally speaking, the stocks and bonds of larger companies are more liquid than those of small companies. When the liquidity of these two kinds of assets changes, it affects asset returns.
Lastly, we cannot leave out emerging markets when we think of the global market. You can apply the same logic. Investors receive a higher return for taking on risk in emerging countries compared with developed countries. There can be more risk when investing in emerging markets, such as political instability and exchange rate risk. For this reason, assets in emerging countries are highly volatile, and their prices can suddenly drop dramatically. Therefore, the difference between asset returns in emerging markets and in developed markets can become a factor.
Now, you can create a model that describes the return of an asset by running a multiple regression on several macro factors, just as we did with the Fama-French factor model. Depending on the type of asset, the degree of response to each macro factor will be different. In regression terms, the coefficients of the multiple regression and their p-values will vary. You can check this yourself later. The stock index is mainly affected by long-term economic growth, inflation, and interest rate exposure. Government bonds are affected by inflation and interest rates, and corporate bonds are affected by credit, inflation, and interest rate exposure.
Now, I'll download the data. You can use this data when you do the project later. Today, I'll again use what you have learned so far. I'll use the tidyverse, quantmod, and lubridate packages. To use macroeconomic data as factors, we download data from FRED (Federal Reserve Economic Data) on the St. Louis Fed website.
For this purpose, I stored nine tickers in a variable called macro. These are the tickers used on the FRED site for, respectively, GDP, CPI, the three-month Treasury bill, the ten-year Treasury note, the AAA bond rate, the BAA bond rate, the unemployment rate, industrial production, and the crude oil price. Let's now download these data using quantmod's getSymbols. Using a for loop, you can download the data for each ticker in the same way you did in the previous classes. Use getSymbols in a similar way to when we downloaded the data from [inaudible]. But this time, you need to use src = "FRED" to specify the source of the data, and convert the result to a data frame with as.data.frame.
Create the date column by applying the as.POSIXlt function to the row names, so that the dates are recognized correctly and saved in a column called date. as.POSIXlt is a function that comes with R. With the date as the input, the time zone can be specified using tz = and the format of the input can be specified using format =. The format argument can be written in various ways using the percent sign with capital Y, m, and d (%Y, %m, %d), depending on the year, month, and day layout of the input. We give the row names an empty value and name the first column macro_value. Remember, the date column will be the second column since we created it later. Use the as.yearqtr function to show the year and quarter to which each date belongs. I also made a ticker column, so that the different macro series can be brought into one data frame and bound together.
The data we are downloading come at different frequencies (daily, monthly, and quarterly), so we need to unify the frequencies to create data for the model. Since the lowest frequency in our data is quarterly, we unify everything into quarterly data. We apply the ymd function to the date column so that it is recognized as a date in a format the tidyverse can work with. We use group_by to group the data by year and quarter. We then apply the top_n function to each group.
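The download and tidying steps described above can be sketched as follows. This is a sketch under stated assumptions, not the lecture's exact code. The lecture names the nine series but not their FRED IDs, so the IDs in `macro` are assumed. `tidy_macro` and `download_macro` are hypothetical helper names. The `as.Date()` wrapper around `as.POSIXlt()` is an adjustment so the column can be used directly with dplyr.

```r
# Assumed FRED series IDs for the nine series named in the lecture.
macro <- c("GDP", "CPIAUCSL", "DTB3", "DGS10", "DAAA", "DBAA",
           "UNRATE", "INDPRO", "DCOILWTICO")

# Tidy the data frame that as.data.frame() produces from an xts object:
# dates sit in the row names and there is a single value column.
tidy_macro <- function(df, ticker) {
  # Parse the row names as dates; tz and format are stated explicitly.
  df$date <- as.Date(as.POSIXlt(rownames(df), tz = "UTC", format = "%Y-%m-%d"))
  rownames(df) <- NULL                      # clear the row names
  colnames(df)[1] <- "macro_value"          # first column holds the values
  df$quarter <- zoo::as.yearqtr(df$date)    # year and quarter of each date
  df$macro_ticker <- ticker                 # remember which series this is
  df
}

# Download one series from FRED and tidy it (requires network access).
download_macro <- function(ticker) {
  raw <- quantmod::getSymbols(ticker, src = "FRED", auto.assign = FALSE)
  tidy_macro(as.data.frame(raw), ticker)
}

# Offline usage on a toy data frame shaped like as.data.frame(xts_object).
toy <- data.frame(UNRATE = c(3.6, 3.5, 3.4),
                  row.names = c("2023-01-01", "2023-02-01", "2023-03-01"))
toy_tidy <- tidy_macro(toy, "UNRATE")
print(toy_tidy)
```

With a network connection, `lapply(macro, download_macro)` would return one tidy data frame per ticker, which plays the same role as the for loop in the lecture.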
The top_n function returns the rows with the largest values of the date column within each group. The 1 in the top_n call means we keep only the single largest value. Since we have grouped the data by quarter, top_n picks the last available date in each quarter. Series that are already quarterly keep their original observations, monthly series are taken from the last month of the quarter, and daily series are taken from the last day of the quarter. We use the filter function to keep the data from 1980 onward. We drop the date column and use only the quarter column. I downloaded all the material for creating macro factors this way and bound the newly made data into the variable macro_factors.
When you have data with different frequencies, the simplest and most common approach is to match everything to the lowest frequency. However, we sometimes match the data to the highest frequency instead, which means we have to create data to fill in the shorter intervals. For instance, if you decide to match daily data, you must create values between the quarterly observations. For this purpose, we again rely on modeling. The critical point here is that we match the frequencies because all the data go into the same model, so we also need to match the frequency of the factor data with that of the asset return data.
I will download and use S&P 500 data. Once we have downloaded the monthly data using the getSymbols function, we use the adjusted price to calculate quarterly returns with the quarterlyReturn function. We save this as S&P 500_returns and convert it to a data frame with as.data.frame. The row names of S&P 500_returns are the first day of each quarter. I saved them by creating a column named date. I also created a ticker column holding the ticker GSPC. Now, the date is recognized using the ymd function, in the same way as before. Then we use the as.yearqtr function to store the year and quarter in a column called quarter.
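The quarterly alignment described above (parse the date, group by quarter, keep the last observation, filter from 1980, drop the date) can be tried on a small made-up series. This is a minimal sketch: the `toy` data frame and its `TOY` ticker are hypothetical. `top_n()` follows the lecture; since dplyr 1.0.0 it is superseded by `slice_max()`, which does the same job here.

```r
library(dplyr)
library(lubridate)
library(zoo)

# Made-up irregular series standing in for one downloaded ticker.
toy <- data.frame(
  date = c("1979-12-31", "2020-01-15", "2020-03-31", "2020-04-01", "2020-06-30"),
  macro_value = c(9.9, 1.0, 1.2, 1.3, 1.5),
  macro_ticker = "TOY"
)

quarterly <- toy %>%
  mutate(date = ymd(date),                 # parse text into Date
         quarter = as.yearqtr(date)) %>%   # label each row with its quarter
  group_by(quarter) %>%
  top_n(1, date) %>%                       # keep the latest date in each quarter
  ungroup() %>%
  filter(year(date) >= 1980) %>%           # keep data from 1980 onward
  select(-date)                            # the quarter column is enough now

print(quarterly)
```

Two rows remain, one per 2020 quarter, each carrying the last value observed in that quarter. The 1979 observation is removed by the filter.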
Now, I remove the date column, which we no longer need, by using the select function. We now have quarterly return data for the S&P 500. Let's tie the downloaded data together using the left_join we learned last time. Macro_factors and S&P 500_returns are combined by the column called quarter. Use the spread function to create columns with macro_ticker as the key and macro_value as the value. Last time, the factor data were in each row by date. This time, the S&P 500 return is in each row by date, so we created new columns for the macro data. You can see that the quarterly data set has been created.
Let's build a regression model using the macro factors we created. We use the lm function, but this time we call it only once, because there is only one return series, the S&P 500 index return, under the name quarterly.returns. Last time, I used the lapply function because there were many stocks. We set the na.action argument to na.exclude so that the model does not include any missing data. Let's take a look at the result table.
I won't interpret today's model because today's lecture focused on downloading macroeconomic data and on what to do with data of different frequencies. It takes some tricks and effort to create factors from macroeconomic data and build a meaningful model. I'll leave this part as your project. Please make your best effort to create a good model using all the techniques we learned in the first five lectures. I'll add references and hints for your project in the section describing the project. We look forward to seeing your successful project.
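As a starting point for the project, here is a minimal sketch of the join, reshape, and regression steps on simulated data. Everything in it is made up for illustration: the factor names `FACTOR_A` and `FACTOR_B`, the random values, and the variable name `sp500_returns`. Quarters are plain strings so the example needs no date packages. `spread()` follows the lecture; newer tidyr versions recommend `pivot_wider()` for the same reshape.

```r
library(dplyr)
library(tidyr)

set.seed(1)
quarters <- paste0(rep(2015:2019, each = 4), " Q", 1:4)   # 20 quarter labels

# Long-format macro data: one row per quarter and ticker (simulated values).
macro_factors <- data.frame(
  quarter = rep(quarters, times = 2),
  macro_ticker = rep(c("FACTOR_A", "FACTOR_B"), each = 20),
  macro_value = rnorm(40)
)
macro_factors$macro_value[3] <- NA   # one missing observation, on purpose

# Quarterly index returns (simulated), named as quantmod's quarterlyReturn does.
sp500_returns <- data.frame(
  quarter = quarters,
  ticker = "GSPC",
  quarterly.returns = rnorm(20, sd = 0.05)
)

# Join on quarter, then spread the tickers into one column per factor.
model_data <- left_join(macro_factors, sp500_returns, by = "quarter") %>%
  spread(key = macro_ticker, value = macro_value)

# One regression, since there is a single return series.
fit <- lm(quarterly.returns ~ FACTOR_A + FACTOR_B,
          data = model_data, na.action = na.exclude)
print(summary(fit))
```

With `na.exclude`, the row with the missing factor value is left out of the fit, but `residuals(fit)` still has one entry per row, with `NA` in that position, so results stay aligned with `model_data`.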