

FAQ


General

Where can I get additional information about neural networks?
How can I get better forecasts?
When are neural networks a bad choice for my forecasting?

Data Analysis and Preprocessing

How much historical data do I need?
Why are some columns grayed out after Data Analysis, and why can't they be selected as targets?
What is a categorical column?
How can I see which records and columns were removed from analysis?
What is your algorithm for removing misplaced data?

Network Preparation

What is network training?
What is the best training algorithm for my problem?
Why was the absolute error disabled during the Network Preparation step?
What is “minimum improvement in error”?
How much time is required for network selection?
How can I speed up network selection?
How many hidden layers and units do I need?
How much time is required for training?
Can I change the network parameters after training?

Forecasting and Reporting


How can I forecast several values at once without entering them manually?
How can I change the report format?

 


General

Where can I get additional information about neural networks?
There is a good introductory book written by Kevin Gurney and available online at: http://www.shef.ac.uk/psychology/gurney/notes/index.html

You can also try Dr. Leslie Smith’s brief online introduction to neural networks packed with pictures and examples at: http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html.

A good introductory book for managers and business analysts is:
Bigus, J.P. (1996), Data Mining with Neural Networks: Solving Business Problems--from Application Development to Decision Support, NY: McGraw-Hill.

For engineers and technically minded people, we recommend starting with: Fausett, L. (1994), Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Englewood Cliffs, NJ: Prentice Hall.

For financial specialists, bankers and traders, we recommend starting with: Azoff, E.M. (1994), Neural Network Time Series Forecasting of Financial Markets, NY: John Wiley and Sons.

How can I get better forecasts?
There are two ways to improve results:
1) improve your input data (for more information, see the Preparing Data Sets section in the Advanced Issues chapter)
2) improve network topology selection and network training (for more information, see the Selecting Network Topology and Training Network sections in the Advanced Issues chapter).

When are neural networks a bad choice for my forecasting?

Neural networks cannot create or extract information that is not contained in your data. To train a neural network properly, you need a lot of data. Your data should contain input parameters (signals, attributes, correlated values) that affect the target value: a change in the input parameters should lead to a change in the target.
So, if you have only a small amount of historical data, or if you do not know which parameters influence your target value, it is better to use another forecasting method.
In addition, some problems cannot be solved by neural networks in principle. Do not use neural networks (or other numerical methods) for problems such as:

  • predicting random or pseudo-random numbers, such as lottery numbers
  • forecasting cash flow, sales volumes, etc. if your business isn’t stable and your market situation often changes dramatically
  • any problem where historical data are of no use due to unbiased, rapid and significant changes in the problem environment



Data Analysis and Preprocessing


How much historical data do I need?
You definitely need more records in the training subset than the total number of input columns.
The number of records needed for training depends on the complexity of your problem and the amount of noise in your data. There are no exact rules. Typically, it’s recommended to have at least 10 times as many training records as input columns.
This may not be enough for problems with subtle and complex dependencies in the data. Try adding more data if your network produces poor results.
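
The rule of thumb above can be expressed as a quick check. This is only an illustrative sketch: the function name and the three labels are hypothetical and are not part of Alyuda Forecaster.

```python
def check_training_size(n_records: int, n_input_columns: int, ratio: int = 10) -> str:
    """Compare a training subset size with the rule of thumb from this FAQ.

    Hypothetical helper for illustration; the labels are our own.
    """
    if n_records <= n_input_columns:
        # Fewer (or equally many) records than input columns: not enough to train.
        return "too few"
    if n_records < ratio * n_input_columns:
        # More records than inputs, but below the recommended 10x ratio.
        return "marginal"
    return "ok"


print(check_training_size(n_records=50, n_input_columns=8))
```

Even an "ok" result is no guarantee: problems with subtle dependencies or noisy data may need considerably more records.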

Why are some columns grayed out after Data Analysis, and why can't they be selected as targets?
The grayed columns cannot be converted for use with neural networks. These are typically text columns, date/time columns, or columns that have a lot of misplaced or missing data.
You can control which columns are accepted or ignored in Expert Mode:

  • The handling of missing and misplaced values can be specified at the Data Analysis step.
  • Columns identified as containing text can be treated as categorical by instructing Alyuda Forecaster to accept them.
  • Date/time columns can be used only after specifying the required periodicity of their encoding.

What is a categorical column?
Each value of a categorical column represents a certain category. For example, a column that contains only “Male” or “Female” as its values is categorical. Typically, the number of different values in a categorical column is much smaller than the number of records.
Categorical data must be encoded in a special way to be suitable for a neural network.
You can manually mark a column as categorical in Expert Mode (using the Details button at the Data Analysis Progress step). This can be beneficial in some cases. For example, suppose your data has a column “Model” with the values “1”, “2”, “3”. By default, this column will be treated as numeric, but it is more beneficial to encode it as categorical.
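
To illustrate what "encoded in a special way" can mean, the sketch below uses one-of-N (one-hot) encoding, a common scheme for categorical inputs. We assume this scheme purely for illustration; the FAQ does not state which encoding Alyuda Forecaster applies, and the function name is hypothetical.

```python
def one_hot_encode(values):
    """Encode each categorical value as a 0/1 vector with a single 1.

    Illustrative only: one-hot encoding is an assumed scheme, not
    necessarily the one used by Alyuda Forecaster.
    """
    categories = sorted(set(values))                 # stable category order
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1                     # mark this value's category
        encoded.append(row)
    return categories, encoded


# The "Model" column from the example above, stored as text labels.
categories, rows = one_hot_encode(["1", "3", "2", "1"])
for label, row in zip(["1", "3", "2", "1"], rows):
    print(label, row)
```

Encoded this way, model “3” is no longer treated as "three times" model “1”, which is the reason a numeric-looking label column can benefit from being marked as categorical.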

How can I see which records and columns were removed from analysis?

During the Data Analysis step, click the “Details” button and you will see your data with grayed columns and rows. All colored cells will be removed from further use. In the Details window you can also see the reason a record was removed. Cells containing missing data, misplaced data or outliers are painted in different colors. You can control this process in Expert Mode, where you can set your preferences for data analysis.

What is your algorithm for removing misplaced data?

If all data in one of your columns are numbers, with the exception of several values, the Wizard will identify this column as numeric. Those several values will be identified as misplaced, and the records containing them will be removed. The same is true for other types of columns.
The main question for this algorithm is: how many is “several”? If you suspect that your data may have misplaced values, you need to give the Wizard a clue about how many misplaced values there can be in your noisiest column. You can do this during the Data Analysis step in Expert Mode.
There is no misplaced data handling in Standard Mode. All columns are considered to be free of misplaced values, and if a numeric column contains at least one text value, it will be considered a text column.
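
The behavior described above can be sketched roughly as follows. This is not the Wizard's actual algorithm: the function names and the `max_misplaced_fraction` parameter are hypothetical stand-ins for the Expert Mode setting, and only the numeric-versus-text case is shown.

```python
def is_number(text: str) -> bool:
    """Return True if the cell text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def classify_column(values, max_misplaced_fraction=0.0):
    """Classify a column as numeric or text, tolerating a few misplaced cells.

    max_misplaced_fraction is a hypothetical parameter: 0.0 mimics
    Standard Mode (no tolerance); larger values mimic the hint given
    to the Wizard in Expert Mode.
    Returns the column type and the indices of misplaced records.
    """
    misplaced = [i for i, v in enumerate(values) if not is_number(v)]
    if len(misplaced) <= max_misplaced_fraction * len(values):
        return "numeric", misplaced      # records at these indices are removed
    return "text", []                    # too many non-numbers: whole column is text


column = ["1.5", "2", "abc", "4", "5", "6", "7", "8", "9", "10"]
print(classify_column(column))           # no tolerance, as in Standard Mode
print(classify_column(column, 0.2))      # up to 20% misplaced values allowed
```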



Network Preparation

What is network training?
Network training means adjusting neural network weights. During training the network analyzes the data you have provided and changes weights between network units to reflect dependencies found in your data.
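
As a minimal illustration of "adjusting weights", the sketch below trains a single linear unit with plain gradient descent. It is a teaching example under our own assumptions (one input, squared error, fixed learning rate), not one of the training algorithms offered by Alyuda Forecaster.

```python
def train_single_unit(samples, learning_rate=0.05, epochs=500):
    """Fit output = weight * x + bias by repeatedly nudging the weights.

    Illustrative gradient descent on squared error; all names and
    default values are assumptions made for this example.
    """
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = weight * x + bias
            error = output - target
            # Move each weight a small step in the direction that reduces the error.
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


# Data generated from target = 2 * x + 1; training should recover these values.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train_single_unit(samples)
print(round(weight, 3), round(bias, 3))
```

A real network has many such weights across several layers, but the principle is the same: the weights change until the outputs reflect the dependencies in the data.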

What is the best training algorithm for my problem?

If your data have up to 10 input columns, the best training algorithm will be Levenberg-Marquardt. It is fast and quite reliable.
If you have a data set with hundreds of thousands of records and more, we recommend trying Incremental Back Propagation first.
In all other cases, the best choice depends entirely on your type of problem and the dependencies in your data. We recommend starting with Conjugate Gradient Descent, then trying Quick Propagation, and, as the last step, Batch Back Propagation or Incremental Back Propagation.
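
The recommendations above can be summarized as a small lookup. The function is hypothetical, and two details are our assumptions: "hundreds of thousands of records" is approximated as 100,000 or more, and the rule for up to 10 input columns is checked first because the FAQ lists it first.

```python
def suggest_training_algorithms(n_input_columns: int, n_records: int):
    """Return training algorithms to try, in the order suggested by this FAQ.

    Hypothetical helper; the 100_000 threshold and the order in which the
    rules are checked are assumptions made for illustration.
    """
    if n_input_columns <= 10:
        return ["Levenberg-Marquardt"]
    if n_records >= 100_000:
        return ["Incremental Back Propagation"]
    return [
        "Conjugate Gradient Descent",
        "Quick Propagation",
        "Batch Back Propagation",
        "Incremental Back Propagation",
    ]


print(suggest_training_algorithms(n_input_columns=40, n_records=5_000))
```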

Why was the absolute error disabled during the Network Preparation step?

When your target column is not numeric, it is hard to define unambiguously what the absolute error is. In such cases it is better to use only relative errors, which are enough to fully control the training process.
In Expert Mode you can use CCR (Correct Classification Rate) instead of defining an error threshold.
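
For reference, a correct classification rate is the share of records whose predicted category matches the actual one. The sketch below shows that calculation; the function name is hypothetical, and how Alyuda Forecaster reports CCR internally is not described here.

```python
def correct_classification_rate(predicted, actual):
    """Fraction of records whose predicted category equals the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


predicted = ["Male", "Female", "Male", "Male"]
actual = ["Male", "Female", "Female", "Male"]
print(correct_classification_rate(predicted, actual))  # 3 of 4 correct: 0.75
```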

What is “minimum improvement in error”?

Minimum improvement in error specifies the minimum change in error required during each iteration (or over the last several iterations). This parameter is useful for detecting situations where the network cannot improve its performance any further and training should be stopped to save time.
Be careful with this parameter, however, because in certain cases the error can start decreasing again after many “motionless” iterations. It’s impossible to detect such cases automatically. We recommend setting 10 iterations, which is enough for most problems. To be more certain, you can set up to 100 iterations.
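
A stopping rule of this kind can be sketched as follows. This is one plausible reading of the parameter, not the product's exact rule: the function name, the `window` argument and the way improvement is measured over the window are assumptions.

```python
def should_stop(error_history, min_improvement, window=10):
    """Decide whether training has stalled.

    error_history: error after each iteration, oldest first.
    window: how many recent iterations to look back over (assumed to
            correspond to the iteration count recommended above).
    Returns True when the error improved by less than min_improvement
    over the last `window` iterations.
    """
    if len(error_history) <= window:
        return False                      # not enough iterations to judge yet
    improvement = error_history[-window - 1] - error_history[-1]
    return improvement < min_improvement


still_improving = [1.0 - 0.05 * i for i in range(11)]
stalled = [0.5] * 11
print(should_stop(still_improving, min_improvement=0.01))
print(should_stop(stalled, min_improvement=0.01))
```

A larger window makes the rule more patient with “motionless” stretches, at the cost of longer training.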

How much time is required for network selection?
The time required for network selection depends on the number of inputs, amount of data, complexity of the task and capability of your computer. The network selection can last from several seconds to several hours.

How can I speed up network selection?

The first way is to select the “Rough search” method, which is the quickest one but does not guarantee the best results.
The second way is to specify the minimum and maximum number of hidden units your problem may require (Expert Mode only). This approach requires some experience with neural networks and at least an approximate estimate of the problem's complexity.

How many hidden layers and units do I need?

In our experience, the majority of problems (ca. 80%) have a good solution with 1 hidden layer, another part (ca. 20%) has a good solution with 2 layers, and only 1-2% of problems need 3 layers or more. More than two hidden layers are typically beneficial only for special problems, such as ZIP code recognition.

If you have too few hidden units, you will get a large error during forecasting, because the network does not have enough capacity to find and encode the dependencies in your data. If you have too many hidden units, the neural network tends to memorize your data rather than encode dependencies, and this also leads to a large error during forecasting.

For the majority of problems, there is only one way to find the best number of hidden units: train several networks with different numbers of hidden units and pick the best network by comparing forecasting errors on the testing subset.
Alyuda Forecaster uses several proprietary algorithms to search for the best number of hidden units. In our view, these algorithms strike the best balance between reducing the search time and finding the best variant.
To search among all variants you can run an exhaustive search, but be prepared to wait a long time.
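
The exhaustive variant of this search is simple to sketch. The proprietary search algorithms are not shown here, and everything named below is hypothetical: `train_and_test` stands for any routine that trains a network with the given number of hidden units and returns its forecasting error on the testing subset.

```python
def exhaustive_hidden_unit_search(train_and_test, min_units, max_units):
    """Try every hidden unit count in the range and keep the best one.

    train_and_test: callable(units) -> error on the testing subset
                    (a hypothetical stand-in for real training).
    """
    best_units, best_error = None, float("inf")
    for units in range(min_units, max_units + 1):
        error = train_and_test(units)     # one full training run per candidate
        if error < best_error:
            best_units, best_error = units, error
    return best_units, best_error


def toy_error(units):
    """Toy error curve: too few units underfit, too many memorize the data."""
    return (units - 6) ** 2 / 36 + 0.1


print(exhaustive_hidden_unit_search(toy_error, min_units=1, max_units=12))
```

Because every candidate requires a full training run, narrowing the minimum and maximum number of hidden units shortens the search proportionally.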

How much time is required for training?

The time required for network training depends on the number of inputs, the number of hidden units, the amount of data, the complexity of the task and the capability of your computer. Complete network training can take from several seconds to several hours.

Can I change the network parameters after training?

Yes, you can press the “Back” button and change network parameters, but you will need to train your network again. The previous network will be lost unless you saved it in a file.



Forecasting and Reporting

How can I forecast several values at once without entering them manually?
Alyuda Forecaster doesn’t have this feature.

How can I change the report format?

During the Reporting step, press the “Show Report” button. You will see a report preview. Click “Save As…” in the “File” menu and select the desired format in the “Save as type” dropdown list.
