Papers by Research Area


Research Areas:


Budgeted Learning

Some machine learning applications have training data where the labels are available but the attributes describing the examples must be purchased. This problem is, in a sense, a dual to active learning. Budgeted learning algorithms attempt to intelligently choose the attributes to purchase in order to learn well with as few purchased attributes as possible.

Kun Deng, Chris Bourke, Stephen Scott, Julie Sunderman, and Yaling Zheng. Bandit-Based Algorithms for Budgeted Learning. In Proceedings of the Seventh IEEE International Conference on Data Mining. October 2007, pages 463-468.
Abstract
paper (849Kb PDF)


Active Learning

Some machine learning applications have abundant amounts of unlabeled data with an oracle that is capable of labeling a relatively small number of these examples to be used in supervised training. Active learning algorithms attempt to intelligently choose the examples to label in order to learn well with as few labeled examples as possible.

Matt Culver, Deng Kun, and Stephen Scott. Active Learning to Maximize Area Under the ROC Curve. In Proceedings of the Sixth IEEE International Conference on Data Mining. December 2006, pages 149-158.
Abstract
paper (146Kb PDF)

Thomas Osugi, Deng Kun, and Stephen Scott. Balancing Exploration and Exploitation: A New Algorithm for Active Machine Learning. In Proceedings of the Fifth IEEE International Conference on Data Mining, pages 330-337. November 2005.
Abstract
paper (340Kb PDF)


Receiver Operating Characteristic (ROC) Analysis

Receiver Operating Characteristic (ROC) analysis is an alternative means of measuring classifier peroformance. We study mechanisms of adjusting multi-class classifiers to improve performance (with respect to its ROC hypersurface) when faced with nonuniform misclassification costs.

Chris Bourke, Kun Deng, Stephen D. Scott, Robert Schapire, and N. V. Vinodchandran. On reoptimizing multi-class classifiers. Machine Learning, 71(2–3):219–242. doi:10.1007/s10994-008-5056-8, 2008.
Abstract
On-line version

Kun Deng, Chris Bourke, Stephen Scott, and N. V. Vinodchandran. New Algorithms for Optimizing Multi-Class Classifiers via ROC Surfaces. In Proceedings of The Third Workshop on ROC Analysis in Machine Learning, pages 17-24, June 2006.
Abstract
264Kb PDF




Making Exponential-Time Learning Algorithms Efficient

Linear threshold-based learning algorithms that use multiplicative weight updates (e.g. Weighted Majority and Winnow) can guarantee low error rates even when given an exponential number of inputs. However, these algorithms will not run efficiently unless special techniques are used to compute the weighted sums of the inputs in polynomial time. Our work, previously supported by a CAREER grant from the National Science Foundation, investigated such techniques.

Using Markov chain Monte Carlo techniques to estimate weighted sums in Winnow and Weighted Majority

Qingping Tao and Stephen D. Scott. Improved MCMC Sampling Methods for Estimating Weighted Sums in Winnow with Application to DNF Learning. Machine Learning, 73(2):107–132. doi:10.1007/s10994-008-5063-9.
Abstract
On-line version

Qingping Tao and Stephen D. Scott. An Analysis of MCMC Sampling Methods for Estimating Weighted Sums in Winnow. Technical report UNL-CSE-2004-0007, University of Nebraska, 2004.
Abstract
214Kb PDF

Qingping Tao and Stephen Scott. An analysis of MCMC sampling methods for estimating weighted sums in Winnow. In Intelligent Engineering Systems Through Artificial Neural Networks (ANNIE 2003), Volume 13, pages 15-20, St. Louis, MO, November 2003.
Abstract
98Kb PDF

Deepak Chawla, Lin Li, and Stephen D. Scott. On approximating weighted sums with exponentially many terms. Journal of Computer and System Sciences, 69(2):196-234, 2004.

Deepak Chawla, Lin Li, and Stephen D. Scott. On approximating weighted sums with exponentially many terms. Technical report UNL-CSE-2003-1, University of Nebraska, 2003.
Abstract
350Kb PDF

Deepak Chawla, Lin Li, and Stephen D. Scott. Efficiently approximating weighted sums with exponentially many terms. In Proceedings of the Fourteenth Annual Conference on Computational Learning Theory, pages 82-98, Trippenhuis, Amsterdam, the Netherlands, July 2001.
Abstract

Using the virtual weights technique to exactly compute weighted sums in Winnow and Exponentiated Gradient. Using approximation algorithms and heuristics in multiple-instance learning. (Follow links to paper descriptions.)


Learning Geometric Patterns/Multiple-Instance Learning

Here we study the problem of learning geometric patterns and explore our algorithms' applicability to pattern recognition problems, including landmark matching and content-based image retrieval. The problem of learning geometric patterns is a generalization of the multiple-instance learning model in that more complex functions are used to label a bag of instances.

Generalized multiple-instance learning with applications to vision, content-based image retrieval, drug discovery, and protein sequence modeling

Qingping Tao, Stephen Scott, N. V. Vinodchandran, Thomas Osugi and Brandon Mueller. Kernels for Generalized Multiple-Instance Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(12):2084–2097, December 2008. doi 10.1109/TPAMI.2007.70846.
Abstract
IEEE Entry

Qingping Tao, Stephen Scott, N. V. Vinodchandran, Thomas Osugi and Brandon Mueller. An Extended Kernel for Generalized Multiple-Instance Learning. In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence , pages 272-277, Boca Raton, Florida, November 2004.
Abstract
104Kb PDF

Qingping Tao, Stephen Scott, N. V. Vinodchandran, and Thomas Osugi. SVM-Based Generalized Multiple-Instance Learning via Approximate Box Counting. In Proceedings of the Twenty-First International Conference on Machine Learning, pages 799-806, Banff, Alberta, Canada, July 2004.
Abstract
137Kb PDF

Qingping Tao and Stephen Scott. A faster algorithm for generalized multiple-instance learning. In Proceedings of the 17th International Florida Artificial Intelligence Research Society Conference (FLAIRS), pages 550-555, 2004.
Abstract
112Kb PDF

Stephen D. Scott, Jun Zhang, and Joshua Brown. On Generalized Multiple-Instance Learning. International Journal of Computational Intelligence and Applications, 5(1):21-35, March 2005.
Abstract

Stephen D. Scott, Jun Zhang, and Joshua Brown. On Generalized Multiple-Instance Learning. Technical report UNL-CSE-2003-5, University of Nebraska, 2003.
Abstract
449Kb PDF

Overview paper (few details)

Stephen D. Scott. Geometric patterns: Algorithms and applications. In the ICML 2000 Workshop on Machine Learning of Spatial Knowledge, pages 109-115, Palo Alto, California, July 2000.
Abstract
92Kb compressed PostScript
142Kb PDF
Slides: 63Kb compressed PostScript, 121Kb PDF

A d-dimensional algorithm based on Exponentiated Gradient (EG) and Gradient Descent

Sally A. Goldman and Stephen D. Scott. Multiple-instance learning of real-valued geometric patterns. Annals of Mathematics and Artificial Intelligence, 39(3):259-290, November 2003.
Abstract
Kluwer Online entry

Sally A. Goldman and Stephen D. Scott. Multiple-instance learning of real-valued geometric patterns. Technical report UNL-CSE-99-006, University of Nebraska, 2000.
Abstract
141Kb compressed PostScript
316Kb PDF

d-dimensional algorithms based on Littlestone's Winnow

Stephen Scott. Agnostic Learning of General Geometric Patterns and Multi-Instance Learning in ℝd. In Proceedings of The 2004 International Conference on Machine Learning and Applications (ICMLA '04), pages 192-199, Louisville, Kentucky, December 2004.
Abstract
137Kb PDF

Sally A. Goldman, Stephen S. Kwek and Stephen D. Scott. Agnostic learning of geometric patterns. Journal of Computer and System Sciences, 62(1):123-151, February 2001.
Abstract

Sally A. Goldman, Stephen S. Kwek and Stephen D. Scott. Agnostic learning of geometric patterns. Technical report WUCS-98-27, Washinton University in St. Louis, 1998.
Abstract
120Kb compressed PostScript

Sally A. Goldman, Stephen S. Kwek and Stephen D. Scott. Agnostic learning of geometric patterns. In Proceedings of the Tenth Annual ACM Conference on Computational Learning Theory, pages 325-333, Nashville, Tennessee, July 1997.
Abstract
62Kb compressed PostScript
Slides:


A 1-dimensional algorithm based on the statistical query model

Sally A. Goldman and Stephen D. Scott. A theoretical and empirical study of a noise-tolerant algorithm to learn geometric patterns. Machine Learning, 37(1):5-49, October 1999.
Abstract

Sally A. Goldman and Stephen D. Scott. A theoretical and empirical study of a noise-tolerant algorithm to learn geometric patterns. Technical report WUCS-97-20, Washinton University in St. Louis, 1997.
Abstract
319Kb compressed PostScript

Sally A. Goldman and Stephen D. Scott. A theoretical and empirical study of a noise-tolerant algorithm to learn geometric patterns. In Proceedings of ICML '96: The 13th International Conference on Machine Learning, pages 191-199, Bari, Italy, July 1996.
Abstract
70Kb compressed PostScript
Slides:


A 1-dimensional (noise intolerant) Occam algorithm

Paul W. Goldberg, Sally A. Goldman and Stephen D. Scott. PAC-learning of one-dimensional patterns. Machine Learning, 25(1):51-70, October 1996.
Abstract

Paul W. Goldberg, Sally A. Goldman and Stephen D. Scott. PAC-learning of one-dimensional patterns. Technical report WUCS-95-10, Washinton University in St. Louis, 1995.
Abstract
102Kb compressed PostScript

Most of the above

Stephen D. Scott. Exploring Applications of Learning Theory to Pattern Matching and Dynamic Adjustment of TCP Acknowledgment Delays. Doctoral thesis, Washington University in St. Louis, August 1998.
Abstract
Entire thesis: 1200Kb compressed PostScript
Part 1 only (ToC, Intro, and pattern matching): 1056Kb compressed PostScript
Part 2 only (TCP acknowledgment delays, references, vita): 203Kb compressed PostScript
Slides:


Automatic Annotation of Web Pages

Here we give a framework to automatically annotate web pages with meta-information. The framework, called LASSO (formerly AutoSHOE), is very extensible in that one can easily acquire new training data, plug in new learning algorithms, label new web pages, and verify these labels.

Christopher Hammack and Stephen Scott. LASSO: A Learning Architecture for Semantic Web Ontologies. In Proceedings of The 2004 International Conference on Machine Learning and Applications (ICMLA '04), pages 10-17, Louisville, Kentucky, December 2004.
Abstract
PDF
LASSO web site

QingFeng Lin, Stephen D. Scott, and Sharad C. Seth. A machine learning framework for automatically annotating web pages with Simple HTML Ontology Extension (SHOE). In Proceedings of the International Conference on Intelligent Agents, Web Technology and Internet Commerce, pages 303-310, Las Vegas, NV, July 2001.
Abstract
91Kb compressed PostScript
254Kb PDF


Piecewise Regression

Here we present a model and an algorithm (EMPRR) for general regression of high-dimensional piecewise-defined functions.

Manimozhiyan Arumugam and Stephen Scott. EMPRR: A High-Dimensional EM-Based Piecewise Regression Algorithm. In Proceedings of The 2004 International Conference on Machine Learning and Applications (ICMLA '04), pages 264-271, Louisville, Kentucky, December 2004.
Abstract
PDF


Bioinformatics

Here we introduce a new method to simulate realistic evolutionary processes of protein sequences with insertions and deletions (indels). Applications include enhancements of evolutionary methods such as multiple alignments and phylogenetic methods.

Cory Strope, Stephen D. Scott, and Etsuko Moriyama. indel-Seq-Gen: a new protein family simulator incorporating domains, motifs, and indels. Molecular Biology and Evolution, 24(3):640-649. doi:10.1093/molbev/msl195, 2007.
Abstract
On-line version


Here we introduce new similarity measures for analyzing 3D protein structures. Applications include finding new structural motifs and identifying new proteins in various superfamilies.

Chang Wang and Stephen Scott. New Kernels for Protein Structural Motif Discovery and Function Classification. In Proceedings of the Twenty-Second International Conference on Machine Learning, pages 945-952. Bonn, Germany, August 2005.
Abstract
paper (203Kb PDF)
slides (449Kb PDF)

Here we introduce new distance measures for the construction and analysis of phylogenies, focusing on thioredoxin-fold proteins.

Chang Wang, Stephen Scott Qingping Tao, Dmitri Fomenko, and Vadim Gladyshev. New Techniques for Generation and Analysis of Evolutionary Trees. In Proceedings of The 2004 International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS'04), pages 283-289, Las Vegas, NV, June 2004.
Abstract
145Kb PDF

Here we develop novel methods for searching for new proteins in the thioredoxin-fold superfamily. This superfamily has very low primary sequence conservation, making conventional (hidden Markov model-based) techniques difficult to apply.

Chang Wang, Stephen D. Scott, Jun Zhang, Qingping Tao, Dmitri E. Fomenko, and Vadim N. Gladyshev. A Study in Modeling Low-Conservation Protein Superfamilies. Technical report UNL-CSE-2004-0003, University of Nebraska, 2004.
Abstract
115Kb PDF

Stephen D. Scott, Haifeng Ji, Peggy Wen, Dimitri E. Fomenko, and Vadim N. Gladyshev. On Modeling Protein Superfamilies with Low Primary Sequence Conservation. Technical report UNL-CSE-2003-4, University of Nebraska, 2003.
Abstract
114Kb PDF


Optical Networks

Here we study various problems in circuit construction, traffic grooming, and switch design in all-optical networks.

Jitender Deogun, Zsolt Tuza, Stephen Scott, and Lin Li. Weighted edge-decompositions of graphs. Journal of Combinatorial Mathematics and Combinatorial Computing, 53:197-208, 2005.

Lan Kong, Lin Li, Jitender Deogun, and Stephen Scott. Wavelength Assignment for Light-Tree Protection in WDM Optical Networks. In Proceedings of The Fifth IASTED International Conference on Wireless and Optical Communications, pages 58-63, Banff, Canada, July 2005.
Abstract
114Kb PDF

Lin Li, Jitender Deogun, and Stephen Scott. Performance analysis of optical packet switches with a hybrid buffering architecture [Invited Paper]. The Journal of Optical Networking, 3(6):433-449, 2004.
Abstract
JON Entry

Lin Li, Stephen Scott, and Jitender Deogun. A novel fiber delay line buffering architecture for optical packet switching. In Proceedings of the IEEE 2003 Global Communications Conference (GLOBECOM 2003), pages 2809-2813, San Francisco, CA, December 2003.
Abstract
118Kb PDF

Lin Li, Stephen Scott, and Jitender Deogun. Performance analysis of WDM optical packet switches with a hybrid buffering architecture. In Proceedings of the Fourth Annual Optical Networking and Communications conference (OptiComm 2003), pages 346-356, Dallas, TX, October 2003.
Abstract
166Kb PDF

Lin Li, Stephen Scott, and Jitender Deogun. Cost-effective approaches for circuit construction in WDM SONET rings. In Proceedings of IASTED International Conference--Wireless and Optical Communications, pages 333-338, Banff, Canada, July 2002.
Abstract
117Kb PDF


Dynamically Adjusting TCP Acknowledgment Delays

Here we study the problem of dynamically adjusting the delay period of acknowledgments in the Transmission Control Protocol (TCP). We take an algorithmic approach and supplement it with learning.

Daniel R. Dooly, Sally A. Goldman, and Stephen D. Scott. On-line analysis of the TCP acknowledgment delay problem. Journal of the ACM, 48(2):243-273, March 2001.
Abstract
ACM entry

Daniel R. Dooly, Sally A. Goldman, and Stephen D. Scott. TCP dynamic acknowledgment delay: Theory and practice. Technical report WUCS-98-29, Washinton University in St. Louis, 1998.
Abstract
240Kb compressed PostScript

Daniel R. Dooly, Sally A. Goldman, and Stephen D. Scott. TCP dynamic acknowledgment delay: Theory and practice. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, pages 389-398, Dallas, Texas, May 1998.
Abstract
108Kb compressed PostScript
Slides:


Stephen D. Scott. Exploring Applications of Learning Theory to Pattern Matching and Dynamic Adjustment of TCP Acknowledgment Delays. Doctoral thesis, Washington University in St. Louis, August 1998.
Abstract
Entire thesis: 1200Kb compressed PostScript
Part 1 only (ToC, Intro, and pattern matching): 1056Kb compressed PostScript
Part 2 only (TCP acknowledgment delays, references, vita): 203Kb compressed PostScript
Slides:


Learning with Unspecified Attribute Values

Sally A. Goldman, Stephen S. Kwek and Stephen D. Scott. Learning from examples with unspecified attribute values. Information and Computation, 180(2):82-100, 2003.
Abstract

Sally A. Goldman, Stephen S. Kwek and Stephen D. Scott. Learning from examples with unspecified attribute values. Technical report WUCS-98-28, Washinton University in St. Louis, 1998.
Abstract
100Kb compressed PostScript

Sally A. Goldman, Stephen S. Kwek and Stephen D. Scott. Learning from examples with unspecified attribute values. In Proceedings of the Tenth Annual ACM Conference on Computational Learning Theory, pages 231-242, Nashville, Tennessee, July 1997.
Abstract
70Kb compressed PostScript


Hardware-Based Genetic Algorithms

Here we exploit the reconfigurability of SRAM-based field-programmable gate arrays (FPGAs) to implement a genetic algorithm (GA) in hardware. The reconfigurable hardware element makes a hardware-based GA feasible since we can change the fitness function at run time.

Stephen D. Scott, Sharad Seth, and Ashok Samal. A synthesizable VHDL coding of a genetic algorithm. In Lance Chambers, editor, The Practical Handbook of Genetic Algorithms, Volume III: Complex Coding Systems. CRC Press, Inc., 1998. ISBN: 0849325390. Includes detailed description of functionality of the HGA's VHDL code.
Abstract

Stephen D. Scott, Sharad Seth, and Ashok Samal. A synthesizable VHDL coding of a genetic algorithm. Technical report UNL-CSE-97-009, University of Nebraska, November 1997. Includes detailed description of functionality of the HGA's VHDL code.
Abstract
141Kb compressed PostScript

Stephen D. Scott, Sharad Seth, and Ashok Samal. A Hardware Engine for Genetic Algorithms. Technical report UNL-CSE-97-001, University of Nebraska, July 1997.
Abstract
HTML
346Kb compressed PostScript

Stephen D. Scott. HGA v1.3.1: VHDL source code for the HGA design. Same as what appears in Appendix A of the thesis, but with a few improvements.

Stephen D. Scott, Ashok Samal and Sharad Seth. HGA: A hardware-based genetic algorithm. In Proceedings of the 1995 ACM/SIGDA Third International Symposium on Field-Programmable Gate Arrays, pages 53-59, Monterey, CA, February 1995. Also in WSC1: The 1st Online Workshop on Soft Computing in the Special Session on Evolutionary Electronics.
Abstract
HTML
48Kb compressed PostScript
Slides:

Stephen D. Scott. HGA: A Hardware-Based Genetic Algorithm. Master's thesis, University of Nebraska, August 1994. Also in technical report UNL-CSE-94-020, University of Nebraska.
Abstract
445Kb compressed PostScript
Slides:


Miscellaneous Genetic Algorithms

Stephen D. Scott, Ashok Samal and Trent Bills. Genetic algorithms and general constraints. Unpublished manuscript, 1993.
Abstract
41Kb compressed PostScript

Genetic algorithms. Graduate Student Colloquium, Computer Science Department, University of Nebraska. September 14, 1993.
38Kb compressed PostScript


To Stephen D. Scott's home page


Last modified 30 May 2008.