forked from MichaelFerrier/HTMCLA
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathreadme.txt
415 lines (334 loc) · 24.4 KB
/
readme.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
===========================================================================================
HTMCLA
===========================================================================================
A C++ implementation of Numenta's Hierarchical Temporal Memory Cortical Learning Algorithm
By Michael Ferrier, 2013.
===========================================================================================
This C++ implementation is based on Numenta's CLA white paper:
https://www.groksolutions.com/htm-overview/education/HTM_CorticalLearningAlgorithms.pdf
It builds on the OpenHTM C# implementation:
https://sourceforge.net/p/openhtm/
A couple of demonstration videos are online here:
http://www.youtube.com/watch?v=IXg_XIm5kqk
http://www.youtube.com/watch?v=YeBC9eew3Lg
===========================================================================================
Build Instructions
===========================================================================================
HTMCLA depends on the Qt user interface library. The latest version can be downloaded here:
http://qt-project.org/downloads
To get Qt working and building with Visual Studio 2012 on Windows 7 64-bit, I had to follow
these steps:
1) Install QT from http://qt-project.org/downloads. Use "Qt 5.0.2 for Windows 64-bit
(VS 2012, 500 MB)".
2) Install QT's Visual Studio add in, from http://qt-project.org/downloads. Use "Visual
Studio Add-in 1.2.1 for Qt5".
3) Open VS, open QT5 menu, go to QT Options. Add the msvc2012_64 directory under the QT
installation (the directory with 'bin' in it).
4) In VS, under Build >> Configuration Manager, make sure Platform is set to x64.
5) Under Project >> Properties >> Configuration Properties >> Linker >> Advanced, make sure
Target Machine is also set to X64.
6) Under QT5 >> QT Project Settings, set Version to msvc2012_64.
7) Create and attempt to build project. May find that you need to rename lib files being used
(in the QT install) to have a '5' before the dot at the end of their filenames.
I can't speak for how Qt 5 should be configured on other setups, but I find there's a lot of
help on the web.
Once you have Qt5 integrated into your environment, you should be able to create a project in
an environment of your choice that incorporates QT 5 and each of this project's .cpp and .h
files. If you're using Visual Studio 2012, you should be able to use the htm.sln file.
===========================================================================================
Using HTMCLA
===========================================================================================
File Menu
=========
Load Network: Load a .xml file specifying the architecture of a CLA network. This file would
be created in a text editor. The HTMCLA package comes with several test network files in the
"data" subdirectory.
Load Data: You can load the segment and synapse data into a network, so that you can load up
the state of that network after training. This is especially useful for complex learning
tasks where training takes a long time. You can only load a data file into the same network
from which it was saved (although the network parameters may be diffrent from what they were
when the data was saved). Data is saved in a .clad file (CLA data).
Save Data: After training a network, you can save out its segment and synapse data so that
you can load it directly into the same network on a later run, without having to redo the
training. It is saved out as a .clad (CLA data) file.
Exit: Exit the program.
View Menu
=========
Update while running: If this is checked, the UI will be updated after each time step when
running the network. This slows down execution significantly, but may be useful to watch
progress in some cases.
Mouse Menu
==========
Toggles mouse function between "Select" and "Drag". In select mode, whatever is under the
mouse is selected when the left button is clicked. When in drag mode, the view can be
repositioned by being dragged. This function remains buggy.
Network Panel
=============
Displays the filename of the loaded network.
The || (Pause), > (Single Step), and >> (Continuous Run) buttons are used to control
execution of the network.
The current time step is displayed, as well as the "Stop at:" field. If a number is entered
here, continuous execution will cease at that given time step.
Selected Panel
==============
By left clicking on a cell or column in one of the views, that cell or column is selected.
Information about the selected cell, column and region is displayed in this panel. If a cell
is selected and if it has distal segments, a list of those segments is displayed at the
bottom. If the views are displaying connections (synapses), then all of the selected column's
proximal synapses will be shown, and all of the selected cell's distal synapses will be
shown. To display just the distal synapses on one segment, select that segment by clicking
on its line in the list. To deselect the segment and return to displaying all synapses on
all distal segments, click the "Display all segments" button.
The table of segments is currently quite slow to update, so displaying the segments on a cell
with many distal segments can cause the program to pause for a few seconds.
The Views
=========
On the left side of the UI are two views of the network. Each one can display one region or
input space. If you'd like, they can both display the same region or input space. To select
what it is that a view displays, select the name of the region or input space from the view's
"Show:" menu.
A view's "Options" menu contains several options for what information is displayed in the
view:
View Activity: Displays in the view what cells are active (orange), and what cells are
predicted (yellow for one-step prediction, pink for >1 step prediction).
View Top-Down Reconstruction: Displays a reconstruction of what is active in the region or
input space, based on what is active in the region(s) that it outputs to. Synapses are
traced backwards from active cells in the output region(s), to determine what cells in
this region (or input space) would lead to those activations.
View Prediction: Displays a reconstruction of what is predicted to be active in the region or
input space during the next time step, based on what cells are 1-step predictive in the
region(s) that it outputs to. Synapses are traced backwards from 1-step predictive cells in
the output region(s), to determine what cells in this region (or input space) would lead to
those activations.
View Boost: When this is on, each column is colored to show its relative boost value. The
brightest red columns have the highest boost values, while grey columns are not boosted at
all.
View Connection In: When this is on, synapses from cells that input to the selected cell
and/or column are displayed in the view. Red synapses are not connected; green synapses are
connected. The brighter the color, the further the synapse's value from ConnectedPerm (in
either direction). Zoom in on the area to view the permanence values of the syanpses. A
selected column's proximal synapses are shown in it's region's input region(s) or input
space(s). A cell's distal synapses are shown in the cell's same region.
View Connections In: This feature is not yet implemented. Eventually it will display the
synapses that output from the selected cell to other cells.
View Marked Cells: Cells can be marked so as to, for example, compare what cells are active
at one time step with what cells are active at a later time step. If this option is
selected, then a black dot will be shown over each marked cell.
Mark Active and Predicted Cells: All cells that are currently active or in predictive state
in the view are marked. Note that the marks will only show if "View Marked Cells" is
selected.
Mark Predicted Cells: All cells that are in predictive state are marked.
Mark Learning Cells: All cells that are learning cells (a subset of active cells) are marked.
The magnifying class icons and the slider between them are used to zoom in and out of the
view. At closer zoom levels, more informaton is displayed, such as the coordinates of
columns, the permanence values of synapses, and an "L" over learning cells.
The scrollbars may be used to reposition what part of the view is visible.
The divider between the two views can be slid from side to side to changes the sizes of the
views. The divider can also be slid all the way to the outer edge, to hide a view. It can
then be grabbed back from the outer edge to re-show that view.
===========================================================================================
Network Files
===========================================================================================
Network architecture is specified in a .xml file that can be created in a text editor. HTMCLA
comes with several example network files, located in the "data" subdirectory. The network
file specifies all parameters for each region and input space in the network, as well as any
input patterns that an input space will use.
The network file format is based loosely on the OpenHTM network file format, with a number
of changes. Most of the features of a network file are self explanatory by looking at the
examples. Here are a few notes:
The synapse parameters can be specified separately for proximal and distal synapses, using
the <ProximalSynapseParams> and <DistalSynapseParams> blocks. When used outside of any
Region block (ie. when not nested), these parameters are used as the default for all
proximal or distal synapses throughout all Regions. In addition to specifying network-wide
default synapse parameters, you can specify synapse parameters specific to a Region by using
the <ProximalSynapseParams> and <DistalSynapseParams> blocks nested within the <Region>
block.
A network is composed of <InputSpace> and <Region> blocks. InputSpaces provide input from the
"outside" of the network, while Regions are where the CLA learning takes place. Each Region
must have an <Inputs> block, which specifies one or more inputs to that Region, as well as
the input radius for that input. A Region may have any number of inputs, and each one may be
either an InputSpace or another Region.
The <Boost> tag for a Region is used to specify the boost rate as well as the max allowed
boost value.
The BoostingPeriod, SpatialLearningPeriod and TemporalLearningPeriod tags are used to specify
in what range of time steps each of these types of learning is allowed to take place within
for the Region. Typically boosting and spatial learning are ended before temporal learning
begins for a Region, so that the temporal pooler will be working with stable spatial
representations. However this is unnecessary if the learning rates are slow enough that the
temporal pooler has a chance to adapt to changes in the spatial representations. The value -1
can be used for any of these, to mean "start from the beginning" or "never stop".
MinOverlapToReuseSegment has a "min" and a "max" value. A value within this range is chosen
randomly for each column within the region. This allows columns to have some variation in how
predisposed they are to re-using the same segment (and so the same learning cell) for a
somewhat different context, versus choosing a different segment (and so a different learning
cell). This way, some columns will be more context sensitive while others will tend to
generalize over different contexts, providing a mix of the strengths of both approaches.
===========================================================================================
Notes on Parameters
===========================================================================================
Through experimentation I discovered a few properties about the different parameters and how
they relate to one another:
NewNumberSynapses (aka newSynapsesCOunt in the white paper): This number should always be
lower than the number of active cells that a cell usually has in its LocalityRadius (ie.,
that are available to be incorporated into distal segments). That way a distal segment will
be adding synapses to a sample from the local activity, not adding *all* of the local
activity. This is how its function is described in the Numenta docs. I found that if
newSynapsesCount is greater than the number of active cells in localityRadius, then that
leaves blank "slots" to be filled in later with part of a different pattern sharing some of
the same columns, which causes lots of problems (as I described under problem #3 in the
temporal context forking thread on the OpenHTM forum).
MinOverlapToReuseSegment (aka minThreshold in the white paper): The lower this number, the
more likely that a similar pattern will be added to an existing segment rather than having a
new segment created for it. So long as NewNumberSynapses is less than the typical number of
locally active cells (as described above), this parameter doesn't need to be very high.
MinSynapsesPerSegmentThreshold should always be <= SegmentActivationThreshold (to avoid
problems where a cell isn't predicted or activated, but is chosen to be a learning cell).
I've been using values between 1 and 5. As described above, I've added an option to provide
a range for MinSynapsesPerSegmentThreshold, so that each column has a random value within
that range chosen for its own MinSynapsesPerSegmentThreshold. This results in some columns
being predisposed to choose the same cell to represent similar contexts, while others are
biased to pick a different cell. This way the closer two contexts are to one another, the
more cells the representations of those contexts will have in common, which should help with
the learning of generalization. Providing a range like that is not necessary to fix the
temporal forking problem however; using a higher value such as 5 results (in my tests) in
different cells being chosen for diffrent contexts, and all of the temporal forking tests
working correctly. Further note: This parameter is very important for the temporal context
forking problem. Too low and the same cells will be chosen for a given pattern in different
contexts. Too high and different cells will be chosen all the time as new context is
learned further and further back, resulting in reuse of some of the same cells as it loops
around (a short looping sequence such as ABBCBBA).
SegmentActivationThreshold (aka activationThreshold in the white paper): This number should
not be lower than MinSynapsesPerSegmentThreshold, as implied in the white paper description
for getBestMatchingSegment(): "The number of active synapses is allowed to be below
activationThreshold, but must be above minThreshold." In my tests I have it set to 5.
PermanenceIncrease and PermanenceDecrease: If setting your parameters to allow the same
segment to be used sometimes in similar contexts (by using a lower value for
MinSynapsesPerSegmentThreshold), then PermanenceInc and PermanenceDec (for distal synapses)
must be set so as to allow multiple patterns to be supported by a single segment. The higher
the ratio of PermanenceInc/PermanenceDec, the more patterns can be supported in a single
segment. I'm using PermanenceInc 0.01 and PermanenceDec 0.005.
===========================================================================================
Changes from the White Paper and the OpenHTM implementation
===========================================================================================
This C++ implementation of the CLA sticks pretty close to the Numenta white paper and to the
OpenHTM implementation. However there are a few major differences that should be reviewed.
Hierarchy
=========
In this implementation, an InputSpace provides input to the network and a Region performs
learning. The Region and InputSpace classes both inherit a common interface called DataSpace,
which a Region uses to query its inputs. So, a Region may have any number of inputs, each of
which can be either an InputSpace or another Region. Regions are processed in the order that
they are defined in the network file.
Hypercolumns
============
This implementation incorporates the idea of a cortical hypercolumn, which is a group of
columns (in the brain, about 100 columns) which all receive input from the same area of their
inputs, and which mutually inhibit among themselves. In area V1, one hypercolumn contains
columns that represent lines of all different orientations. The columns in a hypercolumn
mutually inhibit among themselves, which helps to cause the different columns to specialize
in learning to represent different features (mostly lines of different orientations) and
which also causes only one column to be active per hypercolumn at a time. There is a
diffrerent hypercolumn for each small patch of the visual field.
In this implementation, hypercolumns can be used by setting a region's HypercolumnDiameter.
This defaults to 1, in which case the region acts like an OpenHTM region -- each column is
its own hypercolumn. InhibitionRadius and PredictionRadius are specified in hypercolumns, so
if HypercolumnDiameter is > 1, it will affect how these values are used.
A Region has an <Inhibition> tag, with a "type" attribute. If type is set to "automatic",
inhibition will work the way it's described in the white paper and implemented in OpenHTM,
where an inhibition radius is determined based on the average receptive field size of
columns in the Region. If inhibition type is set to "radius", however, then a radius is
specified, which is measured in hypercolumns. If the given radius is 0, then only columns
within the same hypercolumn will mutually inhibit, as in V1.
Overlap
=======
In the CLA white paper and in the OpenHTM implementation, a column's Overlap value is simply
the count of its active, connected proximal synapses, multiplied by the column's Boost value
(a number >= 1). This is also the case in my implementation, but an additional factor is
multipled in. That factor is the ratio of the number of connected active synapses, over the
sum of itself plus the number of strongly connected inactive synapses. A strongly connected
synapse is defined as a synapse with permanence > InitialPerm.
The thinking behind this is that a column which exactly represents a particular pattern of
active inputs should be a better match for that pattern than another column that represents
the same active inputs plus additional active inputs. For example, say a number of inputs
are active such that they form the image of a 45 degree line. One column may have only
enough connected synapses to exactly represent that pattern, while another column may have
all of its synapses connected -- all of the ones that represent that line, plus every other
synapse representing every other input. In this case, in without the factor that I brought
in, both columns would have the same overlap (given that they have the same boost value).
That makes the columns that have most or all of their synapses connected "greedy", in the
sense that they can represent many smaller patterns that they don't fit very well. This
causes problems when trying to apply CLA to a vision application, and may cause problem in
non-visual applications as well. My solution isn't perfect, but to fully address this
problem would mean making major changes to how the spatial pooler works.
Boosting
========
Boosting is very important for its use in "separating" different patterns so that they share
a minimum of active columns. This is necessary in order for the temporal pooler to be able to
differentiate between different spatial paterns, and therefore to be able to make clear
predictions.
I was unable to get useful results out of the boosting algorithm given in the CLA white paper
or in the OpenHTM implementation. Boosting is completely implemented in Column's
PerformBoosting() method, so a look at the code and comments in that method will help explain
the changes I made. In short:
- If a column's ActiveDutyCycle is less than a small fraction of its MaxDutyCycle (the
maximum ActiveDutyCycle of all the columns in its inhibition radius), then a value (the
boost rate) is added to the column's Boost value. This is as described in the CLA.
- If a column's Boost rate is increased for the first non-consecutive time (ie. a new period
of boosting has begun for it), then each of its connected synapses have their permanence
decreased to ConnectedPerm. This makes it easier for the column to come to represent a
modified version of the pattern that it currently represents.
- A column's boost value is limited to the range from its MinBoost to MaxBoost. MinBoost is
a tiny random amount above 1.0, and MaxBoost is a tiny random amount below the Region's
given MaxBoost parameter value. The reason for each column having slightly different
MinBoost and MaxBoost values is so that multiple columns that have the same overlap and
have reached either MinBoost or MaxBoost, will end up having slightly different
(post-boosted) overlap values. This is to avoid ties between many columns that have the
same overlap value, which could cause many more columns than the specified
PercentageLocalActivity should allow, to be active at one time.
- If a column would be boosted higher, but has already reached MaxBoost, then its
un-connected synapse permanence values are incremented by the boost rate, but they are
incremented no higher than ConnectedPerm. If a column has reached MaxBoost and still needs
boosting because it hasn't been active, it has no hope of becoming active and so it needs
to change to represent a different pattern. Increasing the synapse permanence values
gradually gives it an increasing opportunity to represent a different pattern.
Segment Updates
===============
UpdateSegmentActiveSynapses() has been changed to fix a few problems with the original CLA
whitepaper version.
- When short repeating loop patterns, such as AAAXAAAX... are given, the CLA quickly learns
the full pattern and every cell that makes up part of every representation is in
predictive state (1-step or higher) at all times. Because of this, no cell that's part of
any representation ever becomes truly inactive, and so no segment updates can ever be
negatively reinforced. To solve this, I made it so that if a cell goes from 1-step
predictive state to >1-step predictive state, then it negatively reinforces any segment
updates that are proven "incorrect" by this change in predictive state.
- If a cell is put in predictive state at the same time that it is activated, then in the
white paper version and the OpenHTM version, the segment that put it into predictive state
would be positively reinforced by its simultaneous activation. This should not be the case
however; a prediction implies that the cell will activate in the future, not at the same
time. (patterns of simultaneous activation are the spatial pooler's job).
UpdateSegmentActiveSynapses() has been modified so that it will not process such a segment
update in this case; the segment update will stay in the queue until a later time step,
when it can be determined if the prediction (of *future* activation) came true or not.
Network File Format
===================
As described in th section "Network Files", several new features have been added and made
available in the network file format, such as synapse parameters that can be set separately
for distal and proximal syanpses, and optionally on a per-region basis, and the ability to
specify periods for boosting, spatial learning and temporal learning on a per-region basis.
InputSpaces can specify a variety of different types of Patterns, such as text or bitmaps.
===========================================================================================
Getting Started
===========================================================================================
After getting HTMCLA built and running, I recommend testing the various example network files
that can be found in the "data" subdirectory. Open each one in a text editor, and you will
find comments at the top of each file with a brief description and instructions. Most of them
are specialized toward a particular example. net_test.xml includes many different examples,
which can be tested by un-commenting out just the pattern that you want to test.
===========================================================================================
Legal Information
===========================================================================================
For license information for Numenta's HTM CLA technology, and for Digia's Qt library,
see these files in this directory:
License_Numenta.txt
License_Digia.txt