Archive-name: ai-faq/neural-nets/part7 Last-modified: 1998-01-30 URL: ftp://ftp.sas.com/pub/neural/FAQ7.html Maintainer: saswss@unx.sas.com (Warren S. Sarle)
This is part 7 (of 7) of a monthly posting to the Usenet newsgroup comp.ai.neural-nets. See part 1 of this posting for full information about what it is all about.
------------------------------------------------------------------------
The reactive tabu search algorithm has been implemented by a group in Trento, Italy. ISA and VME boards are available, and PCI boards will be soon. We tested the system with the IRIS and SATIMAGE data, and it did better than most other chips. The Neuroclassifier is still available from Holland and is also the fastest NN chip, with a transient time of less than 100 ns.
JPL is making another chip, and ARL in Washington, DC, is making another, so there are a few things going on ...
More information on NN chips can be obtained from the Electronic Engineers Toolbox web page. Go to http://www.eg3.com/ebox.htm, type "neural" in the quick search box, click on "chip co's" and then on "search".
Further WWW pointers to NN Hardware:
http://msia02.msi.se/~lindsey/nnwAtm.html
Here is a short list of companies:
HNC Inc. 5930 Cornerstone Court West San Diego, CA 92121-3728 619-546-8877 Phone 619-452-6524 Fax HNC markets the Database Mining Workstation (DMW), a PC-based system that builds models of relationships and patterns in data, and the SIMD Numerical Array Processor (SNAP), an attached parallel array processor in a VME chassis with between 16 and 64 parallel floating point processors. It provides between 640 MFLOPS and 2.56 GFLOPS for neural network and signal processing applications. A Sun SPARCstation serves as the host. The SNAP won the IEEE 1993 Gordon Bell Prize for best price/performance for supercomputer class systems.
10260 Campus Point Drive MS 71, San Diego CA 92121 (619) 546 6148 Fax: (619) 546 6736
30 Skyline Drive Lake Mary FL 32746-6201 (407) 333-4379 MicroDevices makes the MD1220 'Neural Bit Slice'. Each of the products mentioned so far has a very different usage. Although this sounds similar to Intel's product, the architectures are not.
2250 Mission College Blvd Santa Clara, CA 95052-8125 Attn ETANN, Mail Stop SC9-40 (408) 765-9235 Intel was making an experimental chip (which is no longer produced): the 80170NW Electrically Trainable Analog Neural Network (ETANN). It has 64 'neurons' on it, almost fully internally connected, and the chip can be put in a hierarchical architecture to do 2 billion interconnects per second. Support software is available from California Scientific Software, 10141 Evening Star Dr #6, Grass Valley, CA 95945-9051, (916) 477-7481. Their product is called 'BrainMaker'.
Penn Center West Bldg IV Suite 227 Pittsburgh PA 15276 They only sell software/simulator but for many platforms.
7a Lavant Street Petersfield Hampshire GU32 2EL United Kingdom Tel: +44 730 60256
1400 NW Compton Drive Suite 340 Beaverton, OR 97006 U. S. A. Tel: 503-690-1236; FAX: 503-690-1249
P.O. Box 14 Marion, OH 43301-0014 Voice (614) 387-5074 Fax: (614) 382-4533 Internet: jwrogers@on-ramp.net InfoTech Software Engineering purchased the software and trademarks from NeuroDynamX, Inc. and, using the NeuroDynamX tradename, continues to publish the DynaMind, DynaMind Developer Pro and iDynaMind software packages.
* NRAM (Neural Retrieve Associative Memory) is available as a stand-alone chip or a functional unit which can be embedded inside another chip, e.g., a digital signal processor or SRAM. The data storage procedure is compatible with conventional memories, i.e., a single presentation of the data is sufficient. Set-up and hold times are comparable with existing devices of similar technology dimensions. Data retrieval capability is where NRAM excels. When addressed, this content addressable memory produces the one previously-stored pattern that matches the presented data sequence most closely. If no matching pattern is found, no data is returned. This set of error-correction and smart retrieval tasks is accomplished without comparators, processors, or other external logic. The number of data bits is adjustable. Optimized circuitry consumes little power. Many applications of NRAM exist in rapid search of large databases, template matching, and associative recall. * The NRAM development environment includes a PC card with an on-board NRAM chip and C++ source code to address the device. Contact: IC Tech, Inc. 2157 University Park Dr. Okemos, MI 48864 (517) 349-4544 (517) 349-2559 (FAX) http://www.ic-tech.com ictech@ic-tech.com
\subsection*{Digital} \subsubsection{Special Computers} {\bf AAP-2} Takumi Watanabe, Yoshi Sugiyama, Toshio Kondo, and Yoshihiro Kitamura. Neural network simulation on a massively parallel cellular array processor: AAP-2. In International Joint Conference on Neural Networks, 1989. {\bf ANNA} B.E.Boser, E.Sackinger, J.Bromley, Y.LeCun, and L.D.Jackel.\\ Hardware Requirements for Neural Network Pattern Classifiers.\\ In {\it IEEE Micro}, 12(1), pages 32-40, February 1992. {\bf Analog Neural Computer} Paul Mueller et al. Design and performance of a prototype analog neural computer. In Neurocomputing, 4(6):311-323, 1992. {\bf APx -- Array Processor Accelerator}\\ F.Pazienti.\\ Neural networks simulation with array processors. In {\it Advanced Computer Technology, Reliable Systems and Applications; Proceedings of the 5th Annual Computer Conference}, pages 547-551. IEEE Comput. Soc. Press, May 1991. ISBN: 0-8186-2141-9. {\bf ASP -- Associative String Processor}\\ A.Krikelis.\\ A novel massively associative processing architecture for the implementation of artificial neural networks.\\ In {\it 1991 International Conference on Acoustics, Speech and Signal Processing}, volume 2, pages 1057-1060. IEEE Comput. Soc. Press, May 1991. {\bf BSP400} Jan N.H. Heemskerk, Jacob M.J. Murre, Jaap Hoekstra, Leon H.J.G. Kemna, and Patrick T.W. Hudson. The BSP400: A modular neurocomputer assembled from 400 low-cost microprocessors. In International Conference on Artificial Neural Networks. Elsevier Science, 1991. {\bf BLAST}\\ J.G.Elias, M.D.Fisher, and C.M.Monemi.\\ A multiprocessor machine for large-scale neural network simulation. In {\it IJCNN91-Seattle: International Joint Conference on Neural Networks}, volume 1, pages 469-474. IEEE Comput. Soc. Press, July 1991. ISBN: 0-7883-0164-1. {\bf CNAPS Neurocomputer}\\ H.McCartor\\ Back Propagation Implementation on the Adaptive Solutions CNAPS Neurocomputer.\\ In {\it Advances in Neural Information Processing Systems}, 3, 1991. 
{\bf GENES~IV and MANTRA~I}\\ Paolo Ienne and Marc A. Viredaz\\ {GENES~IV}: A Bit-Serial Processing Element for a Multi-Model Neural-Network Accelerator\\ Journal of {VLSI} Signal Processing, volume 9, no. 3, pages 257--273, 1995. {\bf MA16 -- Neural Signal Processor} U.Ramacher, J.Beichter, and N.Bruls.\\ Architecture of a general-purpose neural signal processor.\\ In {\it IJCNN91-Seattle: International Joint Conference on Neural Networks}, volume 1, pages 443-446. IEEE Comput. Soc. Press, July 1991. ISBN: 0-7083-0164-1. {\bf Mindshape} Jan N.H. Heemskerk, Jacob M.J. Murre, Arend Melissant, Mirko Pelgrom, and Patrick T.W. Hudson. Mindshape: a neurocomputer concept based on a fractal architecture. In International Conference on Artificial Neural Networks. Elsevier Science, 1992. {\bf mod 2} Michael L. Mumford, David K. Andes, and Lynn R. Kern. The mod 2 neurocomputer system design. In IEEE Transactions on Neural Networks, 3(3):423-433, 1992. {\bf NERV}\\ R.Hauser, H.Horner, R. Maenner, and M.Makhaniok.\\ Architectural Considerations for NERV - a General Purpose Neural Network Simulation System.\\ In {\it Workshop on Parallel Processing: Logic, Organization and Technology -- WOPPLOT 89}, pages 183-195. Springer Verlag, March 1989. ISBN: 3-5405-5027-5. {\bf NP -- Neural Processor}\\ D.A.Orrey, D.J.Myers, and J.M.Vincent.\\ A high performance digital processor for implementing large artificial neural networks.\\ In {\it Proceedings of the IEEE 1991 Custom Integrated Circuits Conference}, pages 16.3/1-4. IEEE Comput. Soc. Press, May 1991. ISBN: 0-7883-0015-7. {\bf RAP -- Ring Array Processor }\\ N.Morgan, J.Beck, P.Kohn, J.Bilmes, E.Allman, and J.Beer.\\ The ring array processor: A multiprocessing peripheral for connectionist applications. \\ In {\it Journal of Parallel and Distributed Computing}, pages 248-259, April 1992. 
{\bf RENNS -- REconfigurable Neural Networks Server}\\ O.Landsverk, J.Greipsland, J.A.Mathisen, J.G.Solheim, and L.Utne.\\ RENNS - a Reconfigurable Computer System for Simulating Artificial Neural Network Algorithms.\\ In {\it Parallel and Distributed Computing Systems, Proceedings of the ISMM 5th International Conference}, pages 251-256. The International Society for Mini and Microcomputers - ISMM, October 1992. ISBN: 1-8808-4302-1. {\bf SMART -- Sparse Matrix Adaptive and Recursive Transforms}\\ P.Bessière, A.Chams, A.Guerin, J.Herault, C.Jutten, and J.C.Lawson.\\ From Hardware to Software: Designing a ``Neurostation''.\\ In {\it VLSI design of Neural Networks}, pages 311-335, June 1990. {\bf SNAP -- Scalable Neurocomputer Array Processor} E.Wojciechowski.\\ SNAP: A parallel processor for implementing real time neural networks.\\ In {\it Proceedings of the IEEE 1991 National Aerospace and Electronics Conference; NAECON-91}, volume 2, pages 736-742. IEEE Comput.Soc.Press, May 1991. {\bf Toroidal Neural Network Processor}\\ S.Jones, K.Sammut, C.Nielsen, and J.Staunstrup.\\ Toroidal Neural Network: Architecture and Processor Granularity Issues.\\ In {\it VLSI design of Neural Networks}, pages 229-254, June 1990. {\bf SMART and SuperNode} P. Bessière, A. Chams, and P. Chol. MENTAL : A virtual machine approach to artificial neural networks programming. In NERVES, ESPRIT B.R.A. project no 3049, 1991. \subsubsection{Standard Computers} {\bf EMMA-2}\\ R.Battiti, L.M.Briano, R.Cecinati, A.M.Colla, and P.Guido.\\ An application oriented development environment for Neural Net models on multiprocessor Emma-2.\\ In {\it Silicon Architectures for Neural Nets; Proceedings for the IFIP WG.10.5 Workshop}, pages 31-43. North Holland, November 1991. ISBN: 0-4448-9113-7. 
{\bf iPSC/860 Hypercube}\\ D.Jackson, and D.Hammerstrom\\ Distributing Back Propagation Networks Over the Intel iPSC/860 Hypercube\\ In {\it IJCNN91-Seattle: International Joint Conference on Neural Networks}, volume 1, pages 569-574. IEEE Comput. Soc. Press, July 1991. ISBN: 0-7083-0164-1. {\bf SCAP -- Systolic/Cellular Array Processor}\\ Wei-Ling L., V.K.Prasanna, and K.W.Przytula.\\ Algorithmic Mapping of Neural Network Models onto Parallel SIMD Machines.\\ In {\it IEEE Transactions on Computers}, 40(12), pages 1390-1401, December 1991. ISSN: 0018-9340.
------------------------------------------------------------------------
For example, in robotics (DeMers and Kreutz-Delgado, 1996, 1997), X might describe the positions of the joints in a robot's arm, while Y would describe the location of the robot's hand. There are simple formulas to compute the location of the hand given the positions of the joints; this is the "forward kinematics" problem. But there is no simple formula for the "inverse kinematics" problem of computing positions of the joints that yield a given location for the hand. Furthermore, if the arm has several joints, there will usually be many different positions of the joints that yield the same location of the hand, so the forward kinematics function is many-to-one and has no unique inverse. Picking any X such that Y = f(X) is OK if the only aim is to position the hand at Y. However, if the aim is to generate a series of points to move the hand through an arc, this may be insufficient: the series of Xs needs to lie in the same "branch" of the function space, and care must be taken to avoid solutions that yield inefficient or impossible movements of the arm.
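As a concrete illustration, here is a toy two-joint planar arm sketched in Python (a minimal sketch with invented link lengths and angles, not the robot model from the papers cited above). The forward kinematics have a simple closed form, yet the "elbow-up" and "elbow-down" joint settings below reach the same hand position, showing that the mapping is many-to-one:

```python
import math

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Hand position (x, y) of a planar two-joint arm with link lengths
    l1, l2 and joint angles theta1 (shoulder) and theta2 (elbow)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# With equal link lengths, these two distinct joint settings place the
# hand at exactly the same point, so no unique inverse exists.
elbow_up = forward_kinematics(0.3, 0.8)
elbow_down = forward_kinematics(0.3 + 0.8, -0.8)
```

A series of hand positions along an arc would have to be solved consistently within one of the two branches to avoid the arm flipping between elbow-up and elbow-down configurations.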
As another example, consider an industrial process in which X represents settings of control variables imposed by an operator, and Y represents measurements of the product of the industrial process. The function Y = f(X) can be learned by a NN using conventional training methods. But the goal of the analysis may be to find control settings X that yield a product with specified measurements Y, in which case an inverse of f(X) is required. In industrial applications, financial considerations are important, so not just any setting X that yields the desired result Y may be acceptable. Perhaps a function can be specified that gives the cost of X resulting from energy consumption, raw materials, etc., in which case you would want to find the X that minimizes the cost function while satisfying the equation Y = f(X).
The obvious way to try to learn an inverse function is to generate a set of training data from a given forward function, but designate Y as the input and X as the output when training the network. Using a least-squares error function, this approach will fail if f() is many-to-one. The problem is that for an input Y, the net will not learn any single X such that Y = f(X), but will instead learn the arithmetic mean of all the Xs in the training set that satisfy the equation (Bishop, 1995, pp. 207-208). One solution to this difficulty is to construct a network that learns a mixture approximation to the conditional distribution of X given Y (Bishop, 1995, pp. 212-221). However, the mixture method will not work well in general for an X vector that is more than one-dimensional, such as Y = X_1^2 + X_2^2, since the number of mixture components required may increase exponentially with the dimensionality of X. And you are still left with the problem of extracting a single output vector from the mixture distribution, which is nontrivial if the mixture components overlap considerably. Another solution is to use a highly robust error function, such as a redescending M-estimator, that learns a single mode of the conditional distribution instead of learning the mean (Huber, 1981; Rohwer and van der Rest 1996). Additional regularization terms or constraints may be required to persuade the network to choose appropriately among several modes, and there may be severe problems with local optima.
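The least-squares pathology can be seen without training any network at all. In this toy Python sketch (an illustration of the argument, not an NN), f(x) = x^2 maps both x = -2 and x = 2 to y = 4; the single prediction that minimizes squared error against both targets is their mean, 0, which is not an inverse at all:

```python
def sq_err(guess, targets):
    """Total squared error of one predicted X against all training targets."""
    return sum((x - guess) ** 2 for x in targets)

# Both x = -2 and x = 2 appear in the training set as outputs for the
# input y = 4, since f(x) = x**2 is many-to-one.
xs = [-2.0, 2.0]

# The least-squares optimum is the arithmetic mean of the targets ...
mean = sum(xs) / len(xs)

# ... but the mean is not a solution: f(0.0) = 0, not 4.
f = lambda x: x ** 2
```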
Another approach is to train a network to learn the forward mapping f() and then numerically invert the function. Finding X such that Y = f(X) is simply a matter of solving a nonlinear system of equations, for which many algorithms can be found in the numerical analysis literature (Dennis and Schnabel 1983). One way to solve nonlinear equations is to turn the problem into an optimization problem by minimizing the sum of squared residuals, sum_i (Y_i - f_i(X))^2. This method fits in nicely with the usual gradient-descent methods for training NNs (Kindermann and Linden 1990). Since the nonlinear equations will generally have multiple solutions, there may be severe problems with local optima, especially if some solutions are considered more desirable than others. You can deal with multiple solutions by inventing some objective function that measures the goodness of different solutions, and optimizing this objective function under the nonlinear constraint Y = f(X) using any of numerous algorithms for nonlinear programming (NLP; see Bertsekas, 1995, and other references under "What are conjugate gradients, Levenberg-Marquardt, etc.?"). The power and flexibility of the nonlinear programming approach are offset by possibly high computational demands.
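A minimal sketch of this numerical-inversion idea in Python, assuming a smooth scalar f (here a toy function rather than a trained network) and using plain gradient descent on the squared residual with a finite-difference derivative:

```python
def invert(f, y_target, x0, lr=0.01, steps=500, eps=1e-6):
    """Find x with f(x) close to y_target by gradient descent on the
    squared residual (f(x) - y_target)**2, using a central-difference
    estimate of f'(x). Which solution is found depends on the starting
    point x0, since a many-to-one f has several inverses (and the
    descent can also get stuck in local optima)."""
    x = x0
    for _ in range(steps):
        residual = f(x) - y_target
        dfdx = (f(x + eps) - f(x - eps)) / (2 * eps)
        x -= lr * 2 * residual * dfdx   # gradient of the squared residual
    return x

f = lambda x: x ** 2
# Starting points on opposite sides of the origin converge to the two
# different inverses of y = 4, namely x = 2 and x = -2.
x_pos = invert(f, 4.0, 3.0)
x_neg = invert(f, 4.0, -3.0)
```

In practice one would use the network's own gradient (backpropagated with respect to the inputs) instead of finite differences, but the structure of the iteration is the same.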
If the forward mapping f() is obtained by training a network, there will generally be some error in the network's outputs. The magnitude of this error can be difficult to estimate. The process of inverting a network can propagate this error, so the results should be checked carefully for validity and numerical stability. Some training methods can produce not just a point output but also a prediction interval (Bishop, 1995; White, 1992). You can take advantage of prediction intervals when inverting a network by using NLP methods. For example, you could try to find an X that minimizes the width of the prediction interval under the constraint that the equation Y = f(X) is satisfied. Or instead of requiring Y = f(X) be satisfied exactly, you could try to find an X such that the prediction interval is contained within some specified interval while minimizing some cost function.
For more mathematics concerning the inverse-function problem, as well as some interesting methods involving self-organizing maps, see DeMers and Kreutz-Delgado (1996, 1997). For NNs that are relatively easy to invert, see the Adaptive Logic Networks described in the software sections of the FAQ.
References:
Bertsekas, D. P. (1995), Nonlinear Programming, Belmont, MA: Athena Scientific.
Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press.
DeMers, D., and Kreutz-Delgado, K. (1996), "Canonical Parameterization of Excess Motor Degrees of Freedom with Self-Organizing Maps," IEEE Trans. Neural Networks, 7, 43-55.
DeMers, D., and Kreutz-Delgado, K. (1997), "Inverse kinematics of dextrous manipulators," in Omidvar, O., and van der Smagt, P., (eds.) Neural Systems for Robotics, San Diego: Academic Press, pp. 75-116.
Dennis, J.E. and Schnabel, R.B. (1983) Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall
Huber, P.J. (1981), Robust Statistics, NY: Wiley.
Kindermann, J., and Linden, A. (1990), "Inversion of Neural Networks by Gradient Descent," Parallel Computing, 14, 277-286, ftp://icsi.Berkeley.EDU/pub/ai/linden/KindermannLinden.IEEE92.ps.Z
Rohwer, R., and van der Rest, J.C. (1996), "Minimum description length, regularization, and multimodal data," Neural Computation, 8, 595-609.
White, H. (1992), "Nonparametric Estimation of Conditional Quantiles Using Neural Networks," in Page, C. and Le Page, R. (eds.), Proceedings of the 23rd Symposium on the Interface: Computing Science and Statistics, Alexandria, VA: American Statistical Association, pp. 190-199.
------------------------------------------------------------------------
Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press, section 8.7.
Masters, T. (1994), Signal and Image Processing with Neural Networks: A C++ Sourcebook, NY: Wiley.
Soucek, B., and The IRIS Group (1992), Fast Learning and Invariant Object Recognition, NY: Wiley.
------------------------------------------------------------------------
A GA is an optimization program that starts with a population of encoded procedures (Creation of Life :-> ), mutates them stochastically (Get cancer or so :-> ), and uses a selection process (Darwinism) to prefer the mutants with high fitness, and perhaps a recombination process (Make babies :-> ) to combine properties of (preferably) the successful mutants. Genetic algorithms are just a special case of the more general idea of "evolutionary computation". There is a newsgroup dedicated to the field of evolutionary computation called comp.ai.genetic. It has a detailed FAQ posting which, for instance, explains the terms "Genetic Algorithm", "Evolutionary Programming", "Evolution Strategy", "Classifier System", and "Genetic Programming". That FAQ also contains lots of pointers to relevant literature, software, other sources of information, et cetera. Please see the comp.ai.genetic FAQ for further information.
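For concreteness, the ingredients named above (a population, mutation, selection, and recombination) can be sketched in a few lines of Python. This is a toy illustration on the classic "OneMax" problem, not any particular published GA; all parameter values are invented:

```python
import random

def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=100,
                      mutation_rate=0.05, seed=0):
    """Minimal GA sketch: a population of bit strings evolves by
    tournament selection, one-point crossover (recombination), and
    random bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # tournament selection: the fittest of three random individuals
            return max(rng.sample(pop, 3), key=fitness)
        next_pop = []
        while len(next_pop) < pop_size:
            mother, father = select(), select()
            cut = rng.randrange(1, n_bits)              # one-point crossover
            child = mother[:cut] + father[cut:]
            # bit-flip mutation: each bit flips with probability mutation_rate
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# OneMax: fitness is simply the number of 1 bits; the optimum is all ones.
best = genetic_algorithm(sum)
```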
There is a web page on "Neural Network Using Genetic Algorithms" by Omri Weisman and Ziv Pollack at http://www.cs.bgu.ac.il/~omri/NNUGA/
Andrew Gray's Hybrid Systems FAQ at the University of Otago at http://divcom.otago.ac.nz:800/COM/INFOSCI/SMRL/people/andrew/publications/faq/hybrid/hybrid.htm also has links to information on neuro-genetic methods.
For general information on GAs, try the links at http://www.shef.ac.uk/~gaipp/galinks.html and http://www.cs.unibo.it/~gaioni
------------------------------------------------------------------------
Fuzzy logic is used where a system is difficult to model exactly (but an inexact model is available), is controlled by a human operator or expert, or where ambiguity or vagueness is common. A typical fuzzy system consists of a rule base, membership functions, and an inference procedure.
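A minimal sketch of those three ingredients in Python; the membership functions, rule base, and numeric values here are invented purely for illustration:

```python
def triangular(a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at 1
    when x = b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# Membership functions for a temperature input (degrees C).
cold = triangular(-10.0, 0.0, 15.0)
warm = triangular(5.0, 20.0, 35.0)

# Rule base: IF temp is cold THEN heater = 0.9;
#            IF temp is warm THEN heater = 0.1.
# Inference: weight each rule's output by its firing strength
# (a Sugeno-style weighted average).
def heater_setting(temp):
    w_cold, w_warm = cold(temp), warm(temp)
    total = w_cold + w_warm
    return (0.9 * w_cold + 0.1 * w_warm) / total if total else 0.0
```

At 10 degrees both rules fire equally, so the inferred heater setting is halfway between the two rule outputs.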
Most fuzzy logic discussion takes place in the newsgroup comp.ai.fuzzy (where there is a fuzzy logic FAQ) but there is also some work (and discussion) about combining fuzzy logic with neural network approaches in comp.ai.neural-nets.
Early work combining neural nets and fuzzy methods used competitive networks to generate rules for fuzzy systems (Kosko 1992). This approach is sort of a crude version of bidirectional counterpropagation (Hecht-Nielsen 1990) and suffers from the same deficiencies. More recent work (Brown and Harris 1994; Kosko 1997) has been based on the realization that a fuzzy system is a nonlinear mapping from an input space to an output space that can be parameterized in various ways and therefore can be adapted to data using the usual neural training methods (see "What is backprop?") or conventional numerical optimization algorithms (see "What are conjugate gradients, Levenberg-Marquardt, etc.?").
A neural net can incorporate fuzziness in various ways:
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms, New York: Plenum Press.
Bezdek, J.C. & Pal, S.K., eds. (1992), Fuzzy Models for Pattern Recognition, New York: IEEE Press.
Brown, M., and Harris, C. (1994), Neurofuzzy Adaptive Modelling and Control, NY: Prentice Hall.
Carpenter, G.A. and Grossberg, S. (1996), "Learning, Categorization, Rule Formation, and Prediction by Fuzzy Neural Networks," in Chen, C.H. (1996), pp. 1.3-1.45.
Chen, C.H., ed. (1996) Fuzzy Logic and Neural Network Handbook, NY: McGraw-Hill, ISBN 0-07-011189-8.
Dierckx, P. (1995), Curve and Surface Fitting with Splines, Oxford: Clarendon Press.
Hecht-Nielsen, R. (1990), Neurocomputing, Reading, MA: Addison-Wesley.
Klir, G.J. and Folger, T.A.(1988), Fuzzy Sets, Uncertainty, and Information, Englewood Cliffs, N.J.: Prentice-Hall.
Kosko, B.(1992), Neural Networks and Fuzzy Systems, Englewood Cliffs, N.J.: Prentice-Hall.
Kosko, B. (1997), Fuzzy Engineering, NY: Prentice Hall.
Lampinen, J and Selonen, A. (1996), "Using Background Knowledge for Regularization of Multilayer Perceptron Learning", Submitted to International Conference on Artificial Neural Networks, ICANN'96, Bochum, Germany.
Lippe, W.-M., Feuring, Th. and Mischke, L. (1995), "Supervised learning in fuzzy neural networks," Institutsbericht Angewandte Mathematik und Informatik, WWU Muenster, I-12, http://wwwmath.uni-muenster.de/~feuring/WWW_literatur/bericht12_95.ps.gz
van Rijckevorsal, J.L.A. (1988), "Fuzzy coding and B-splines," in van Rijckevorsal, J.L.A., and de Leeuw, J., eds., Component and Correspondence Analysis, Chichester: John Wiley & Sons, pp. 33-54.
------------------------------------------------------------------------
------------------------------------------------------------------------
Links to neurosci, psychology, linguistics lists are also provided.
ftp://una.hh.lib.umich.edu/inetdirsstacks/neurosci:cormbonario, gopher://una.hh.lib.umich.edu/00/inetdirsstacks/neurosci:cormbonario, http://http2.sils.umich.edu/Public/nirg/nirg1.html.
------------------------------------------------------------------------
That's all folks (End of the Neural Network FAQ).
Acknowledgements: Thanks to all the people who helped to get the stuff above into the posting. I cannot name them all, because I would make far too many errors then. :-> No? Not good? You want individual credit? OK, OK. I'll try to name them all. But: no guarantee.... THANKS FOR HELP TO: (in alphabetical order of email addresses, I hope)
Bye Warren & Lutz
Previous part is part 6.
Neural network FAQ / Warren S. Sarle, saswss@unx.sas.com