Data: AI’s Kryptonite in Asset Management?

January 06, 2022 EST

Prices, percentages, basis points[1], revenue, sales numbers, operating costs: finance is full of numbers, each with its corresponding metrics. Turn on the financial news on any given day and you will see them. One would think that this is a great place for AI to make its mark. Yet many who have tried have not achieved the results they hoped for. So what is AI’s challenge with financial markets? Surprisingly to most, data may be the biggest foe.

The data alone is not the issue; the issue is what happens when the data is insufficient: overfitting[2]. Overfitting is a modeling error in statistics that occurs when a function is too closely aligned to a limited set of data points. Think of a time when you felt like you knew it all. No one could tell you anything new and, in a certain setting, you were an expert. Then you were thrust into a new situation where all your “knowledge” was useless. This is what happens to an overfit model: it becomes so focused on the data it learned from that it performs badly when introduced to new data. This is a huge problem that anyone using AI, in any field, must constantly address. The source of overfitting is often the data. Let’s dive into some of the key issues with financial data in AI modeling.

Lack of Data Points

  • Time Series[3]: Financial data comes in time series form, and it is impossible to go back in time and create different stock prices for each day. This makes it hard to create more data to keep the model from overfitting. Another issue is that companies have been listed on the market for different lengths of time. Microsoft, for example, went public in 1986, while Tesla went public in 2010.

Meaning of Units

  • Changes to Metrics Used in Valuation: While many valuation metrics[4] have stood the test of time, the metrics we use today may not be the same ones used 10 or 20 years ago. When training an AI to learn patterns, the model needs to account for this so that it makes decisions with the same information a human had at the time of the decision.

Noisy Data

  • Human Error: Numbers on a financial statement or in GDP data are sometimes updated after the quarter ends. While this is easy for a human to see, a computer doesn’t know the difference. If not dealt with, this could cause serious problems for the model when it is given live data, such as look-ahead bias, where the model learns from figures that were not yet known at the time.

Kirin API

To help solve these data issues, Qraft created Kirin API, its proprietary data preprocessing system. Kirin uses a variety of techniques to prepare the data used to train and test our deep learning models[5].

As mentioned above, one of the issues with time series data is that it limits the number of data points available. One technique Qraft uses to address this issue is data augmentation[6]. This process involves making slightly modified copies of existing data to increase the overall amount of data.
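
As a rough illustration of "slightly modified copies," the sketch below adds small random noise (often called jittering) to a series of returns. This is one common augmentation technique and an assumption on our part for illustration; it is not a description of what Kirin API does internally. The function name `jitter_augment`, its parameters, and the return values are all hypothetical.

```python
import numpy as np

def jitter_augment(returns, n_copies, noise_scale, seed=0):
    """Return n_copies noisy versions of a 1-D return series.

    noise_scale is a fraction of the series' own standard deviation,
    so the copies stay close to the original.
    """
    rng = np.random.default_rng(seed)
    returns = np.asarray(returns, dtype=float)
    sigma = noise_scale * returns.std()
    noise = rng.normal(0.0, sigma, size=(n_copies, returns.size))
    # Broadcasting adds the original series to every row of noise.
    return returns + noise

# Made-up daily returns, for illustration only.
daily_returns = np.array([0.012, -0.004, 0.007, -0.011, 0.003])
augmented = jitter_augment(daily_returns, n_copies=3, noise_scale=0.1)

print(augmented.shape)  # one row per augmented copy
```

The noise scale is the key design choice here: too small and the copies add nothing new, too large and they no longer resemble plausible market data.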

Many processes sound technical and have fancy names, but the fix for changing metrics and data corrections is straightforward: give the model the data as it stood at the time. Kirin API makes sure that the data entering the model for a given date is the data that was available to investors on that date, and not the later corrected version.
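
The point-in-time idea can be sketched with a small lookup that returns only what had been published by a given date. Everything here is hypothetical: the revision history is made up, and `value_as_of` is an illustrative helper, not part of Kirin API.

```python
from datetime import date

# Hypothetical revision history: (fiscal period, publication date, reported value).
REVISIONS = [
    ("2021Q3", date(2021, 10, 28), 2.0),  # first release
    ("2021Q3", date(2021, 11, 24), 2.1),  # first revision
    ("2021Q3", date(2021, 12, 22), 2.3),  # second revision
]

def value_as_of(revisions, period, as_of):
    """Return the latest value for `period` published on or before `as_of`.

    Returns None if nothing had been published yet, so the model never
    sees a number that investors could not have known on that date.
    """
    known = [
        (published, value)
        for rev_period, published, value in revisions
        if rev_period == period and published <= as_of
    ]
    if not known:
        return None
    # Pick the most recently published value among those already known.
    return max(known, key=lambda item: item[0])[1]

print(value_as_of(REVISIONS, "2021Q3", date(2021, 11, 1)))
print(value_as_of(REVISIONS, "2021Q3", date(2022, 1, 1)))
```

Training on the final revised figure for every date would quietly leak future information into the past, which is the look-ahead bias described earlier.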

To help cut out the noise in the data, Qraft uses feature extraction[7]. In many data sets, different features (different types of data, such as stock price, sales, or GDP data) are highly correlated with one another. These features are condensed into a single feature to reduce the noise in the data and make the computing process more efficient.
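
One standard way to condense correlated features into a single feature is principal component analysis (PCA). The sketch below is an assumption for illustration, since the text does not say which method Qraft uses; the synthetic data and the helper name `first_principal_component` are made up.

```python
import numpy as np

def first_principal_component(features):
    """Condense the columns of `features` into one score per row.

    Returns the scores along the first principal direction and the
    fraction of total variance that this single feature retains.
    """
    x = np.asarray(features, dtype=float)
    # Standardize each column so no feature dominates because of its units.
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    explained = singular_values[0] ** 2 / np.sum(singular_values ** 2)
    return z @ vt[0], float(explained)

# Synthetic data: three noisy features that all track one hidden driver.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
features = np.column_stack([
    base + rng.normal(scale=0.1, size=200),
    2.0 * base + rng.normal(scale=0.1, size=200),
    -base + rng.normal(scale=0.1, size=200),
])

combined, explained = first_principal_component(features)
print(f"variance retained by the single combined feature: {explained:.1%}")
```

When the original features move together this closely, one combined feature keeps almost all of the information while the model has fewer inputs to fit.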

Because AI is a new concept to many, investors may assume that data is not a problem. It is also important to note that the list above is not comprehensive, nor are the techniques described the only ones Qraft uses. However, with the potential for more AI-powered investment products to come on the market, we hope this will arm investors with the knowledge to ask the right questions.


1. Basis Points - Basis points (BPS) refers to a common unit of measure for interest rates and other percentages in finance. One basis point is equal to 1/100th of 1%, or 0.01%, or 0.0001, and is used to denote the percentage change in a financial instrument.

2. Overfitting - Overfitting is a modeling error in statistics that occurs when a function is too closely aligned to a limited set of data points.

3. Time Series Data - Time series data is a collection of observations obtained through repeated measurements over time. Plot the points on a graph, and one of your axes would always be time.

4. Valuation Metrics – Valuation metrics are comprehensive measures of a company's performance, financial health and prospects for future earnings. Examples include EPS (Earnings per Share), P/E (Price-to-Earnings), etc.

5. Deep Learning Model - Deep learning is a subfield of machine learning concerned with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain.

6. Data Augmentation - Data augmentation is a popular technique used to increase the generalizability of a possibly overfitting model. By generating additional training data and exposing the model to different versions of data within the same class, the training process becomes more robust and the resulting model is more likely to generalize.

7. Feature Extraction - Feature extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing.

Investors should consider the investment objectives, risks, charges and expenses carefully before investing. For a prospectus or summary prospectus with this and other information about the Fund, please call 1-855-973-7880 or visit our website. Read the prospectus or summary prospectus carefully before investing.

The Funds are distributed by Foreside Fund Services, LLC

Investing involves risk, including loss of principal. The Funds are subject to numerous risks including but not limited to: Equity Risk, Sector Risk, Large Cap Risk, Management Risk, and Trading Risk. The Funds rely heavily on a proprietary artificial intelligence selection model as well as data and information supplied by third parties that are utilized by such model. To the extent the model does not perform as designed or as intended, the Fund’s strategy may not be successfully implemented and the Funds may lose value. Additionally, the funds are non-diversified, which means that they may invest more of their assets in the securities of a single issuer or a smaller number of issuers than if they were a diversified fund. As a result, each Fund may be more exposed to the risks associated with and developments affecting an individual issuer or a smaller number of issuers than a fund that invests more widely. A new or smaller fund's performance may not represent how the fund is expected to or may perform in the long term if and when it becomes larger and has fully implemented its investment strategies. Read the prospectus for additional details regarding risks.

While it is anticipated that the Adviser will purchase and sell securities based on recommendations by the U.S. Large Cap Database, the Adviser has full discretion over investment decisions for the Fund. Therefore, the Adviser has full decision-making power not only if it identifies a potential technical issue or error with the U.S. Large Cap Database, but also if it believes that the recommended portfolio does not further the Fund’s investment objective or fails to take into account company events such as corporate actions, mergers and spin-offs.

QRAFT AI-Enhanced U.S. Large Cap ETF: Companies in the health care sector are subject to extensive government regulation and their profitability can be significantly affected by restrictions on government reimbursement for medical expenses, rising costs of medical products and services, pricing pressure (including price discounting), limited product lines and an increased emphasis on the delivery of health care through outpatient services.

QRAFT AI-Enhanced U.S. Large Cap Momentum ETF: The Fund is subject to the risk that market or economic factors impacting technology companies and companies that rely heavily on technology advances could have a major effect on the value of the Fund’s investments. The value of stocks of technology companies and companies that rely heavily on technology is particularly vulnerable to rapid changes in technology product cycles, rapid product obsolescence, the loss of patent, copyright and trademark protections, government regulation and competition, both domestically and internationally, including competition from foreign competitors with lower production costs. Technology companies and companies that rely heavily on technology, especially those of smaller, less-seasoned companies, tend to be more volatile than the overall market.

QRAFT AI-Enhanced US High Dividend ETF: Securities that pay dividends, as a group, may be out of favor with the market and underperform the overall equity market or stocks of companies that do not pay dividends. In addition, changes in the dividend policies of the companies held by the Fund or the capital resources available for such company’s dividend payments may adversely affect the Fund. In the event a company reduces or eliminates its dividend, the Fund may not only lose the dividend payout but the stock price of the company may also fall.

QRAFT AI-Enhanced U.S. Next Value ETF: The value approach to investing involves the risk that stocks may remain undervalued, undervaluation may become more severe, or perceived undervaluation may actually represent intrinsic value. Value stocks may underperform the overall equity market while the market concentrates on growth stocks. The small- and mid-capitalization companies in which the Fund invests may be more vulnerable to adverse business or economic events than larger, more established companies, and may underperform other segments of the market or the equity market as a whole. Securities of small- and mid-capitalization companies generally trade in lower volumes, are often more vulnerable to market volatility, and are subject to greater and more unpredictable price changes than larger capitalization stocks or the stock market as a whole.

Alpha – Alpha is a measure of the active return on an investment, the performance of that investment compared with a suitable market index.

AutoML – Short for Automated Machine Learning, AutoML is the automation of the machine learning process to make machine learning jobs simpler, easier, and faster.

Kirin API - Developed by Qraft’s data scientists, Kirin API integrates multiple vendors to provide both macroeconomic data and company fundamentals as correct point-in-time data.