`Department of Computer Science, Bren Hall 4216
`School of Information and Computer Sciences
`University of California, Irvine
`CA 92697-3435
`telephone: (949) 824 2558
`fax: (949) 824 4056
`email: smyth@ics.uci.edu
`
`Professional Positions
`
`April 1996–present: Professor, Department of Computer Science, University of California, Irvine
`• Chancellor’s Professor: 2018 to present
`• Full Professor: July 2003 to 2018
`• Associate Professor: July 1998 to June 2003
`• Assistant Professor: April 1996 to June 1998
`
`October 1988–March 1996: Member of Technical Staff and Technical Group Leader (from 1992), Jet
`Propulsion Laboratory, California Institute of Technology, Pasadena.
`
`Education
`
`PhD, 1988: California Institute of Technology, Department of Electrical Engineering.
`
`MSEE, 1985: California Institute of Technology, Department of Electrical Engineering.
`
`BE, 1984: National University of Ireland, University College Galway. Bachelor of Engineering (Electronic)
`with First-Class Honors.
`
`Additional Professional Roles and Affiliations
`
`Joint Faculty Appointments:
`• Department of Statistics, UC Irvine, July 2008–present.
`• Department of Education, UC Irvine, July 2017–present.
`
`Founding Director, UCI Data Science Initiative, University of California, Irvine, July 2014–June 2018.
`
`Founding Director, Center for Machine Learning and Intelligent Systems, University of California, Irvine,
`January 2007–June 2014.
`
`Faculty Member, Institute for Genomics and Bioinformatics (IGB), UC Irvine, 2001–present.
`
`Faculty Member, Institute for Mathematical Behavioral Sciences (IMBS), UC Irvine, 1999–present.
`
`Faculty Member, Center for Digital Transformation, UC Irvine, 2012–present.
`
`Faculty Member, Program for Mathematical, Computational, and Systems Biology (MCB), UC Irvine,
`2007–present.
`
`Faculty Member, Center for Research on Information Technology and Organizations (CRITO), UC Irvine,
`2008–2012.
`
`Founding Director and Executive Committee Member of the ACM Special Interest Group on Knowledge
`Discovery and Data Mining (SIGKDD), 1998.
`
`GOOGLE 1003
`
`
`
`Visiting Principal Researcher, Jet Propulsion Laboratory, California Institute of Technology, Pasadena,
`1996–2001.
`
`Member of IEEE (1988–present), American Statistical Association (1997–present), and the Association for
`Computing Machinery (ACM) (1999–present).
`
`Honors and Awards
`
`Fellow, American Association for the Advancement of Science (AAAS), elected 2022
`
`Fellow, Association for Computing Machinery (ACM), elected 2013
`
`Fellow, Association for the Advancement of Artificial Intelligence (AAAI), elected 2010
`
`ACM SIGKDD Innovation Award, 2009
`
`Best paper awards: ACM SIGKDD Conference (best paper (1997, 2002), runner-up best paper (1998, 2000)),
`ACM/IEEE Joint Conference on Digital Libraries (JCDL) (shortlist for best paper, 2007), Educational
`Data Mining Conference (best paper, 2018)
`
`Qualcomm Faculty Award, 2019
`
`Google Faculty Research Awards, 2008 and 2014
`
`IBM Faculty Partnership Award, 2001.
`
`National Science Foundation CAREER award, 1997
`
`ACM Teaching Award, UC Irvine, 1997
`
`NASA Group Achievement award, Jet Propulsion Laboratory, 1997
`
`Lew Allen Award for Excellence in Research, Jet Propulsion Laboratory, 1993
`
`17 NASA Certificates for Technical Innovation (1991–1996)
`
`Advisory and Consulting Activities
`
`AdvanceOC Advisory Board (2020-present); Candor Technologies (2021-present); Wilson, Sonsini, Goodrich
`and Rosati (2019-present); Fox Rothschild LLP (2021); Fish and Richardson (2021); Fenwick and West LLP
`(2019-2021); QuinnEmanuel LLP (2019-2020); Morgan Lewis and Bockius LLP (2019); Erise IP (2017-2018);
`Toshiba (2018-2019); First American (2018-2019); ProLung, Inc (2017-2019); Unified Patents (2016-2019);
`University of Washington (2016-2019); Klarquist LLP (2015-2016); Frost Data Capital (2014-2015);
`AST Inc (2013-2015); Samsung (2012-2015); SOCCCD (2012-present); DigitalRisk (2010-2012); CoreLogic
`(2011-2014); IdentityMetrics (2010-2012); Microsoft (2010-2011); ImageCat (2010); eBay (2009-2011);
`Data-Analytics LLC (2009-2011); QuinnEmanuel LLP (2011); Latham and Watkins (2008-2009, 2011); Netflix
`(2006-2009); Topicseek LLC (2005-2008); Yahoo! (2005-2008); Strativa (2005); IET (2004-2005);
`JWDirect (2001-2004); Credit Sciences (2000-2004); Nokia Research (2000); First Quadrant Financial Services
`(1998-1999); SmithKline Beecham (1998); AT&T (1996-1998).
`
`Postdoctoral Advisees and Current Positions
`
`Tracy Holsclaw, 2011-2014; Consultant, San Jose, CA.
`Ralf Krestel, 2011-2013; Senior Researcher, Hasso-Plattner Institute, Potsdam, Germany.
`Romain Thibaux, 2008-2009; Google, Mountain View, CA.
`Alex Ihler, 2005-2006; Professor, Department of Computer Science, UC Irvine.
`Michael Duff, 2005-2006; Researcher, Fred Hutchinson Cancer Research Center, Seattle, WA.
`Michal Rosen-Zvi, 2003-2004; IBM Research, Israel.
`
`PhD Students
`
`PhD Advisees and Current Positions
`
`Robert Logan IV (co-advised with Sameer Singh), PhD 2022; Dataminr, New York
`Disi Ji, PhD 2020; Instagram, Menlo Park, CA
`Chris Galbraith, PhD 2020; Mandiant, Philadelphia, PA
`Jihyun Park, PhD 2019; Apple, Cupertino, CA
`Dimitris Kotzias, PhD 2018; Google, Zurich
`Eric Nalisnick, PhD 2018; Assistant Professor, University of Amsterdam
`Moshe Lichman, PhD 2017; Google, Irvine, CA
`Nick Navaroli, PhD 2014; Google, Irvine, CA
`Jimmy Foulds, PhD 2014; Assistant Professor, Department of Computer Science, UMBC
`Chris DuBois, PhD 2013; Apple, Seattle
`America Chambers, PhD 2013; Assistant Professor, Department of Mathematics and Computer Science,
`University of Puget Sound
`Drew Frank (co-advised with Alex Ihler), PhD 2013; Apple, Seattle
`Arthur Asuncion, PhD 2011; Google, Seattle, WA
`Jon Hutchins (co-advised with Alex Ihler), PhD 2010; Google, Pittsburgh, PA
`Chaitanya Chemudugunta, PhD 2009; Director, Data Science/Research, Pandora, CA
`Seyoung Kim, PhD 2007; Associate Professor, Department of Bioinformatics, CMU, Pittsburgh
`Darya Chudova, PhD 2007; VP of Bioinformatics, Guardant Health, Redwood City, CA
`Sergey Kirshner, PhD 2005; Amazon, Palo Alto, CA
`Scott Gaffney, PhD 2004; VP of Search Engineering, eBay, San Jose, CA
`Xianping Ge, PhD 2002
`Igor V. Cadez, PhD 2002
`Dimitry Pavlov, PhD 2001; Consultant
`
`Current PhD Students
`
`Advanced to Candidacy: Preston Putzel, Casey Graff, Alex Boyd (co-advised with Stephan Mandt), Gavin
`Kerrigan.
`Pre-Candidacy: Rachel Longjohn, Sam Showalter, Markelle Kelly, Edgar Robles, Catarina Belem, Yuxin
`Chang.
`
`Professional Activities
`
`Journals: Associate/Action Editor
`
`ACM Transactions on Knowledge Discovery from Data, guest editor of special issue on best papers from
`ACM SIGKDD 2011 Conference, TKDD 6(4), 2012.
`
`Journal of the American Statistical Association, 2002 to 2005.
`
`IEEE Transactions on Knowledge and Data Engineering, 2002 to 2004.
`
`Machine Learning Journal, July 1998 to December 2001.
`
`Machine Learning Journal, guest editor of special issue on probabilistic learning, 1997.
`
`Journals, Book Series, Centers: Advisory/Editorial Board Member
`
`Journal of Machine Learning Research, 2000-2020.
`
`Journal of Data Mining and Knowledge Discovery, 1997-present.
`
`Chapman and Hall: Series in Computer Science and Data Analysis, 2002-2008.
`
`Bayesian Analysis, 2004-2007.
`
`Insight Center for Data Analytics, University College Dublin, Scientific Advisory Member, 2015-2020.
`
`Conference Program and General Chair Positions
`
`Associate Program Chair, International Joint Conference on Artificial Intelligence (IJCAI), 2022
`
`Program Chair for the Uncertainty in Artificial Intelligence (UAI) Conference, 2013.
`
`Program Chair for 17th ACM SIGKDD Conference, San Diego, 2011.
`
`Program Chair for the Symposium on the Interface between Statistics and Computing, Costa Mesa, CA,
`June 2001.
`
`General Chair for the Sixth International Conference on Artificial Intelligence and Statistics, January 1997.
`
`Other Conference and Workshop Organization Roles
`
`Conference Organization Roles: Senior Area Chair/Area Chair, NeurIPS 2017, 2018, 2019, 2020, 2021;
`Senior Area Chair/Area Chair, ICML 2018, 2019, 2020, 2021, 2022; Senior Area Chair, AAAI 2020;
`Panels Chair for ACM SIGKDD Fifth International Conference on Knowledge Discovery and Data
`Mining, 1999; Tutorials co-Chair for National Conference on Artificial Intelligence, 1998; Tutorials
`Chair for the ACM SIGKDD Conferences on Knowledge Discovery and Data Mining, 1997 and 1998;
`Publicity Chair for the ACM SIGKDD Conferences on Knowledge Discovery and Data Mining, 1995
`and 1996.
`
`Workshop Co-Chair/Organizer for: Dagstuhl Seminar, Automating Data Science, 2018; Workshop on
`Algorithmic and Statistical Approaches for Large Social Network Data Sets, NIPS Conference, Lake
`Tahoe, 2012; Workshop on User-Centered Modeling, Institute for Mathematics and its Applications
`(IMA), University of Minnesota, 2012; Workshop on Scientific Data Mining, Institute for Pure and
`Applied Mathematics (IPAM), UCLA, 2002; Workshop on Temporal and Spatial Machine Learning,
`International Conference on Machine Learning (ICML), 2001; Massive Datasets workshop at the 1998
`Neural Information Processing Conference (NIPS).
`
`Research and Training Grants, Contracts and Gifts
`
`78. Fair Risk Predictions for Underrepresented Populations using Electronic Health Records, NIH
`R01AG065330-02S1, Sept 1 2021 to April 30 2022, $167,792, co-investigator (PI: Judy Zhong,
`Biostatistics, NYU).
`
`77. Data Science Training and Practices: Preparing a Diverse Workforce via Academic and Industrial
`Partnership, NSF IIS-2123366, Sept 1 2021 to Aug 31 2024, $751,921, Co-principal investigator (PI:
`Babak Shahbaba, Statistics, UCI).
`
`76. Personalized Risk Predictions with Deep Learning Methods in the Presence of Missing and Biased
`Electronic Health Record Data, NIH R01-LM013344, Aug 6 2021 to May 31 2025, $498,957 (UCI
`portion), Principal Investigator (MPI with Judy Zhong, Biostatistics, NYU).
`
`75. Improving Prediction of Fire Extremes in the GEOS Forecasting System on Daily and Seasonal
`Timescales, NASA, Sept 1 2021 to June 30 2025, $1,040,166, Co-principal investigator (PI: Jim
`Randerson, Earth System Sciences, UCI).
`
`74. Addressing the Critical Role of Innate/Adaptive Immunity by Integrating Novel Informatics,
`Translation Technologies and Ongoing Clinical Trial Research, NIH 3UL1TR001414-06S1, Sept 2020 to June
`2021, $1,088,735, co-investigator (PI: Dan Cooper, School of Medicine, UCI).
`
`73. Analyzing Information Exchange in Human-Human Dialog using Machine Learning, SAP Innovation
`Center, $124,000, April 1 2020 to March 31 2021, Principal Investigator.
`
`72. Generative Expectation-based Response and Novelty Identification, DARPA/SRI-HR001120C0021,
`$1,596,858, Oct 1 2019 to March 31 2023, Co-investigator (PI: Stephan Mandt, Computer Science,
`UCI).
`
`71. Machine Learning Democratization via a Linked, Annotated, Repository of Datasets, National Science
`Foundation (CCRI: ENS), award number NSF-1925741, $1,792,952, Oct 1 2019 to Sept 30 2022. Co-
`principal investigator (PI: Sameer Singh, Computer Science, UCI).
`
`70. Hybrid Human Algorithm Predictions: Balancing Effort, Accuracy, and Perceived Autonomy, National
`Science Foundation (EAGER: AI-DCL), award number NSF-1927245, $293,923, Aug 15 2019 to Aug
`14 2021. Co-principal investigator (PI: Mark Steyvers, Cognitive Sciences, UCI).
`
`69. Assessment of Machine Learning Algorithms in the Wild, National Science Foundation, award number
`NSF-1900644, $1,199,898, Oct 1 2019 to Sept 30 2023, Principal Investigator.
`
`68. Qualcomm Faculty Award, $225,000 (gift), May 2019/March 2022, Principal Investigator.
`
`67. Innovation Center for Advancing Ecosystem Climate Solutions, California Strategic Growth Council,
`award number CCR20021, $4,604,140, 4/01/2019 to 3/31/2022, co-investigator (PI: Mike Goulden,
`Earth Systems Sciences, UCI).
`
`66. Hands-free Documentation in Clinical Practice, SAP, $172,000 (gift/sponsored project), October 2018,
`co-Principal Investigator (with Kai Zheng, Department of Informatics, UCI).
`
`65. TRIPODS-X: Data Science Frontiers in Climate Science, National Science Foundation, award number
`NSF-1839336, $300,000, Oct 1 2018 to Sept 30 2021, co-PI (PI: Efi Foufoula-Georgiou, Civil and
`Environmental Engineering, UCI).
`
`64. Large-Scale Classification Algorithms, eBay Labs, $30,000 (gift), Dec 1 2017, Principal Investigator.
`
`63. Center for Machine Learning and Intelligent Systems, Cylance, $50,000 (gift), Dec 1 2017, Principal
`Investigator.
`
`62. Development of Computational Methods for Evaluating Patient-Doctor Communication, PCORI,
`$270,000 (UCI portion), award number ME-1602-34167, July 1 2017 to June 30th 2019, co-Investigator
`(PI: Zac Imel, U Utah).
`
`61. NRT-DESE: Team Science for Integrative Graduate Training in Data Science and Physical Science,
`NSF, award number NSF-1633631, Sep 15 2016 to Aug 31 2021, $2,967,150, Principal Investigator.
`
`60. Learning Individual Predictive Choice Models, Adobe Research Award, $50,000, October 2016,
`Principal Investigator.
`
`59. Transformative Computational Infrastructures for Cell-Based Biomarker Diagnostics, NIH, award
`number U01TR001801-01, 09/01/16 to 08/31/21, $766,000 (UCI portion), co-Investigator (PI: Richard
`Scheuermann, Venter Institute/UCSD).
`
`58. The Big DIPA: Data Image Processing and Analysis, NIH BD2K Program, award number
`1R25EB022366-01, $486,000, Sept 30 2015 to June 30th 2018, co-Investigator (UCI PI: Charless
`Fowlkes).
`
`57. Investigating Virtual Learning Environments, National Science Foundation, award number NSF-
`1535300, $2,500,000, Oct 1 2015 to Sept 30th 2020, co-Investigator (UCI PI: Mark Warschauer).
`
`56. Forensic Science Center of Excellence, National Institute of Standards and Technology (NIST), award
`number 70NANB15H176, $20,000,000 ($4,000,000 for UC Irvine), Oct 1 2015 to Sept 30th 2020, co-
`Investigator (UCI PI: Hal Stern).
`
`55. Data-Intensive Research and Education Center in Science, Technology, Engineering, and Mathematics
`(DIRECT-STEM), NASA MIRO program, award number NNX15AQ06A, $5,000,000 ($1,250,000 for
`UC Irvine), Sept 1 2015 to Aug 31st 2020, Principal Investigator.
`
`54. Analyzing Individual Event Data over Time, Google Faculty Research Award, $60,000, March 2014,
`Principal Investigator.
`
`53. Peer Assessment and Academic Achievement in a Gateway MOOC, Bill and Melinda Gates Foundation,
`Oct 1 2013, $25,000, Co-Investigator (PI: Mark Warschauer, UC Irvine).
`
`52. Statistical Learning Algorithms for Micro-Event Time Series Data, National Science Foundation, award
`number IIS-1320527, Oct 1 2013 to Sept 30th 2018, $499,880, Principal Investigator.
`
`51. Balancing the Portfolio: Efficiency and Productivity of Federal Biomedical R&D Funding, National
`Science Foundation, award number 1158699, Aug 15 2012 to July 31 2015, $297,331, Principal Investigator
`(original PI, David Newman).
`
`50. Location-based Social Media for Context-based Analysis of Transportation Data, Xerox UAC Research
`Award, Jan 1st 2013 to Dec 31st 2015, $90,000 gift, Principal Investigator.
`
`49. Collaborative Research, Type 1: Decadal Prediction and Stochastic Simulation of Hydroclimate over
`Monsoonal Asia, US Department of Energy, award number DOE SC0006619, Sept 1st 2011 to August
`31st 2014, $180,000, Co-Investigator (PI: Andrew Robertson, Columbia University).
`
`48. Copernicus: System for Foresight and Understanding from Scientific Exposition, IARPA, contract
`number D11PC20155, September 2011 to August 2016, $1,097,420, Principal Investigator.
`
`47. Probabilistic Alignment and Distributed Analytics, IARPA/AFRL FA8650-10-C-7060, Oct 1 2010 to
`Dec 31 2011, $334,537, Principal Investigator.
`
`46. Biomedical Informatics Training Program (supplement), award number NIH LM07443-10S1, 7/1/10-
`6/30/11, $153,485, Senior Personnel (PI: Pierre Baldi, UC Irvine).
`
`45. Automating Behavioral Coding via Text-Mining and Speech Signal Processing, National Institutes of
`Health, award number R01AA018673, $3.1 million, (UC Irvine portion is $953,952), Sept 1 2010 to
`August 31 2015, Co-Investigator (PI: David Atkins, University of Washington).
`
`44. UC Irvine Clinical Translational Science Center, National Institutes of Health, award number
`UL1RR031985, $7,075,320 awarded to date, July 1 2010 to March 31st 2015, Senior Personnel (PI:
`Dan Cooper, UC Irvine).
`
`43. Scaling Statistical Topic Modeling Algorithms to Massive Data Sets, Yahoo! Faculty Research (FREP)
`award, $10,000 gift, May 2010, Principal Investigator.
`
`42. Scalable Methods for the Analysis of Network-based Data, Office of Naval Research: Multidisciplinary
`University Research Initiative (MURI) Award, award number N00014-08-1-1015, $5,381,300, May 1
`2008 to April 30 2013, Principal Investigator.
`
`41. Scaling Statistical Topic Modeling Algorithms to Massive Data Sets, Google Research Award, $60,000,
`April 2008, Principal Investigator.
`
`40. Research in Cyber-Fraud Detection and Prevention, gift from Experian, Inc., $200,000, February 2008,
`Co-Principal Investigator with Michael Goodrich.
`
`39. Collaborative Research: Regional Climate-Change Projections Through Next-Generation Empirical and
`Dynamical Models, Department of Energy, Scientific Discovery through Advanced Computing: Climate
`Change Prediction, award number DE-FG02-07ER64429, $360,000, Oct 1 2007 to Sept 30 2010,
`Principal Investigator.
`
`38. CRI: Collaborative Research: Improving Experimental Computer Science with a Searchable Web Portal
`for Datasets, National Science Foundation, award number CNS-0551510, $400,000, March 15, 2006 to
`February 28, 2009, Co-Principal Investigator with Andrew McCallum (University of Massachusetts).
`
`37. Functional Biomedical Informatics Research Network (FBIRN), National Institutes of Health,
`U24RR021992, $23,992,092, from February 8th 2006 to November 30th 2010, Senior Personnel (PI:
`Steven Potkin, UC Irvine).
`
`36. Characterizing ITCZ Dynamics and Breakdown using Statistical Learning Methods and Satellite Data,
`National Science Foundation, award number ATM-0530926, $618,000, 10/1/2005 to 9/30/2008, Co-
`Investigator (PI: Gudrun Magnusdottir, UC Irvine).
`
`35. UC Irvine Knowledge Discovery Evaluation Challenge Project, Entity Analytics Division, International
`Business Machines (IBM), $73,430, 7/15/05 to 12/31/05, Principal Investigator.
`
`34. Bringing Probabilistic Text Mining Techniques to Historical Document Collections: An Early American
`Case Study, UCI CORCLR Award MI-05-06-14, $18,080, 7/1/2005 - 6/30/2006, Co-Investigator (PI:
`Sharon Block, UC Irvine).
`
`33. Transdisciplinary Imaging Genetics Center, NIH Grant No. 1-P20-RR020837-01, total award is
`$1,724,026, 9/28/04 to 7/31/07, Co-Investigator (PI: Steven Potkin, UC Irvine).
`
`32. National Alliance for Medical Image Computing (NAMIC), National Institutes of Health, award number
`NIH U54 EB005149, total UCI award is $609,253 from 9/17/04 to 8/31/06, Co-Investigator (PI: Ron
`Kikinis, Brigham and Women’s Hospital).
`
`31. Morphometry Biomedical Informatics Research Network (MBIRN), National Institutes of Health, U24-
`RR021382, total UCI award is $579,880 from 9/30/04 to 5/31/06, Co-Investigator (PI: Bruce Rosen,
`Massachusetts General Hospital).
`
`30. Studies of regional-scale climate variability and change: Hidden Markov models and coupled ocean-
`atmosphere modes, funded by the Climate Change Prediction Program, US Department of Energy,
`October 1st 2004 to September 30th 2007, Principal Investigator.
`
`29. Statistical Data Mining of Time-Dependent Data with Applications in Geoscience and Biology, NSF-IIS-
`0431085, National Science Foundation, $566,644, October 1st 2004 to September 30th 2007, Principal
`Investigator.
`
`28. NSF-ITR: Responding to the Unexpected, Information Technology Research (ITR) program, National
`Science Foundation, $9,480,928, award number NSF-ITR-0331707, October 1st 2003 to September 30th
`2008, Co-Investigator (PI: Sharad Mehrotra, UC Irvine).
`
`27. NSF-ITR: The OptIPuter, Information Technology Research (ITR) program, National Science
`Foundation, award number , $13,500,000, October 1st 2002 to September 30th 2007, Co-Investigator (PI:
`Larry Smarr, UCSD).
`
`26. Biomedical Informatics Training Program, National Institutes of Health and National Library of
`Medicine, award number T15-LM-07443, $8,840,297, July 1st 2002 to June 30th 2012, Senior
`Personnel (PI: Pierre Baldi, UC Irvine).
`
`25. Predicting Coupled Ocean-Atmosphere Modes With A Climate Modeling Hierarchy, US Department
`of Energy: Climate Change Prediction Program, $396,000, February 1st 2002 to January 31st 2005,
`Co-Investigator (with Andrew Robertson and Michael Ghil, UCLA).
`
`24. Intelligent Time-Series Pattern Matching, Jet Propulsion Laboratory, June 15th to September 30th
`2002, $80,920, Principal Investigator.
`
`23. Preclinical Detection and Disease Measurement of Alzheimer’s Disease and Related Disorders Using
`EEG, Psychophysical and Data Mining Methods, Alzheimer’s Association of America, September 1st
`2001 to August 30th 2003, $250,000, Co-Investigator (PI: Rod Shankle, UC Irvine).
`
`22. Spatial Data Mining for Massive Scientific Data Sets, Lawrence Livermore National Laboratory, May
`1st 2001 to August 31st 2002, $100,000, Principal Investigator.
`
`21. IBM Faculty Partnership Award, gift from IBM Watson Research Center, May 18th 2001, $40,000,
`Principal Investigator.
`
`20. Data Mining of Digital Behavior, NSF-IIS-0083489, Principal Investigator:
`• Original award: September 15th 2001 to August 30th 2004, $425,000.
`• Supplemental award: September 1st 2003 to December 31st 2010, $1,816,750.
`
`19. Predictive Models for Cancer Detection and Therapy, November 1st 2000 to October 31st 2001,
`University of California, Irvine, Cancer Research Grants, $14,301, Co-Investigator (PI: Christine McLaren,
`UC Irvine).
`
`18. Probabilistic Clustering of Dynamic Trajectories for Scientific Data Mining, Institute for Scientific
`Computer Research, Lawrence Livermore National Laboratory, October 1 2000 to September 30 2001,
`$39,178, Renewal: October 1 2001 to September 30 2002, $28,448, Principal Investigator.
`
`17. Sequential Data Analysis for Biomedical Applications, UCI CORCLR Program, July 1 2000 to June
`30th 2001, $12,000, Co-Investigator (PI: Christine McLaren, UC Irvine).
`
`16. Spatio-Temporal Data Mining of Scientific Trajectory Data, Lawrence Livermore National Laboratory,
`March 1st to September 30th 2000, $42,937, Principal Investigator.
`
`15. Research in Data Mining, gift from Microsoft Research, October 1999, $60,000, Principal Investigator.
`
`14. Data Mining of Multivariate Time-Series Sensor Data for Semiconductor Manufacturing,
`NIST/National Semiconductor Corporation, April 1 1999 through Dec 31 2001, $162,000, Principal
`Investigator.
`
`13. Clustering of Sequences and Time Series, HNC Software, Inc, $40,913, January 1 1999 through Dec
`31 1999, Principal Investigator.
`
`12. SGER: An Online Repository of Large Data Sets for Data Mining Research and Experimentation,
`National Science Foundation, NSF IIS-9813584, Aug 15, 1998 to January 31, 2000, $99,737, Principal
`Investigator.
`
`11. Data Mining of High-Dimensional Structure-Activity Data Sets, from SmithKline Beecham Research,
`September 1st 1998 to April 1st 1999, $22,730, Principal Investigator.
`
`10. Graduate Fellowships in Biomedical Computing, US Department of Education, $750,000. Sept 1, 1997
`to August 31, 2001, Co-Investigator (PI: Lubomir Bic, UC Irvine).
`
`9. A Distributed Biomedical Computing Laboratory, National Science Foundation (CISE Research
`Instrumentation), NSF-9617349, March 1 1997 to February 1 1998, $69,986, Co-Investigator (with
`L. Bic et al., University of California, Irvine).
`
`8. Turbo-Decoding of High Performance Error-Correcting Codes via Belief Propagation, AFOSR, grant
`F49620-97-1-0313, May 1 1997 to December 31 1998, $300,000. Co-Investigator (PI: Robert McEliece,
`Caltech).
`
`7. Automated Cloud Screening for Remote Exploration and Experimentation (REE) Applications to the
`Earth Orbiting-1 (EO-1) Satellite and Similar Platforms, the Jet Propulsion Laboratory, June 16th
`1997 to November 15th 1997, $34,601, Principal Investigator.
`
`6. Exploring QSAR Data using Probabilistic Data Mining, SmithKline Beecham Research, July 1st to
`December 31st 1997, $35,048, Principal Investigator.
`
`5. Probabilistic Knowledge Discovery and Data Mining: An Integrated Approach at the Interface of
`Computer Science and Statistics, National Science Foundation (CAREER award), NSF-9703120, September
`1st 1997 to August 31st 2001, $304,379, Principal Investigator.
`
`4. Clustering and Mode Classification of Engineering Time Series Data, Jet Propulsion Laboratory, June
`15th 1996 to October 17th 1996, $34,401, Principal Investigator.
`
`3. Automated Detection of Natural Features in SAR Images, Jet Propulsion Laboratory Director’s
`Discretionary Fund, January 1st 1994 to December 31st 1994, $140,000, Co-Investigator with Usama Fayyad
`(JPL) and Pietro Perona (Caltech).
`
`2. Using Information Theory to Discover Patterns in Databases, Lew Allen Award research grant, Jet
`Propulsion Laboratory. January 1st 1994 to December 31st 1995, $25,000, Principal Investigator.
`
`1. An Information-Theoretic Approach to Distributed Inference and Learning, AFOSR and ONR. Original
`award AFOSR-90-0199, February 1st 1990 to May 30th 1992, $338,161. Continuation award
`N00014-92-J-1860: July 1st 1992 to March 30th 1995, $394,118. Co-Investigator (PI: Rodney Goodman,
`Caltech).
`
`Publications List
`
`Books and Conference Proceedings
`
`B5 A. Nicholson and P. Smyth (eds.), Uncertainty in Artificial Intelligence: Proceedings of the 29th
`Conference, ISBN 978-0-9749039-9-6, AUAI Press, Corvallis, OR, 2013.
`
`B4 C. Apte, J. Ghosh, P. Smyth (eds.), Proceedings of the 17th ACM SIGKDD International Conference
`on Knowledge Discovery and Data Mining, ISBN 978-1-4503-0813-7, ACM Press, New York, NY, 2011.
`
`B3 Modeling the Internet and the Web: Probabilistic Methods and Algorithms, P. Baldi, P. Frasconi, and
`P. Smyth, John Wiley, June 2003.
`
`B2 Principles of Data Mining, D. Hand, H. Mannila, and P. Smyth, Cambridge, MA: MIT Press, 2001.
`
`B1 Advances in Knowledge Discovery and Data Mining, U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and
`R. Uthurasamy (eds.), Palo Alto, CA: AAAI/MIT Press, 1996.
`
`Journal Papers
`
`J85 Y. Chen, S. Hantson, N. Andela, S. Coffield, C. Graff, D. Morton, L. Ott, E. Foufoula-Georgiou, P.
`Smyth, M. Goulden, J. Randerson, ‘California wildfire spread derived using VIIRS satellite observations
`and an object-based tracking system,’ Scientific Data, to appear, 2022.
`
`J84 M. Steyvers, H. Tejeda, G. Kerrigan, P. Smyth, ‘Bayesian modeling of human-AI complementarity,’
`Proceedings of the National Academy of Sciences, 119(11):1-7, March 2022
`
`J83 H. Do, S. Nandi, P. Putzel, P. Smyth, J. Zhong, ‘A joint fairness model with applications to risk
`predictions for under-represented populations,’ Biometrics, 2022.
`
`J82 A. Mamalakis, J. T. Randerson, J-Y Yu, M. Pritchard, G. Magnusdottir, P. Smyth, P. A. Levine, S.
`Yu, E. Foufoula-Georgiou, ‘Zonally contrasting shifts of the tropical rainbelt in response to climate
`change,’ Nature Climate Change, https://doi.org/10.1038/s41558-020-00963-x, 11:143–151, January
`2021.
`
`J81 Park, J., Jindal, A., Kuo, P., Tanana, M., Elston Lafata, J., Tai-Seale, M., Atkins, D. C., Imel, Z.
`E., Smyth, P, ‘Automated rating of patient and physician emotion in primary care visits,’ Patient
`Education and Counseling, https://doi.org/10.1016/j.pec.2021.01.004, 2021.
`
`J80 A. Stevens, R. Willett, A. Mamalakis, E. Foufoula-Georgiou, A. Tejedor, J. Randerson, P. Smyth,
`S. Wright, ‘Graph-guided regularized regression of Pacific Ocean climate variables to increase
`predictive skill of southwestern US winter precipitation,’ Journal of Climate, 34(2), 737–754,
`https://doi.org/10.1175/JCLI-D-20-0079.1, 2021.
`
`J79 Y. Chen, J. T. Randerson, S. R. Coffield, E. Foufoula-Georgiou, P. Smyth, C. A. Graff, D. C. Morton,
`N. Andela, G. R. van der Werf, L. Giglio, L. E. Ott, ‘Forecasting global fire emissions on sub-seasonal
`to seasonal (S2S) timescales,’ Journal of Advances in Modeling Earth Systems, 12(9), e2019MS001955,
`doi:10.1029/2019MS001955, 2020.
`
`J78 C. Galbraith, P. Smyth, H. S. Stern, ‘Statistical methods for the forensic analysis of geolocated event
`data,’ Forensic Science International, https://doi.org/10.1016/j.fsidi.2020.301009, 33:1–12, July 2020.
`
`J77 C. Galbraith, P. Smyth, H. Stern, ‘Quantifying the association between discrete event time series with
`applications to digital forensics,’ Journal of the Royal Statistical Society A, 183(3):1005–1027, 2020.
`
`J76 C. A. Graff, S. R. Coffield, Y. Chen, E. Foufoula-Georgiou, J. T. Randerson, P. Smyth, ‘Forecasting
`daily wildfire activity using Poisson regression,’ IEEE Transactions on Geoscience and Remote Sensing,
`58(7):4837–4851, 2020.
`
`J75 R. Baker, D. Xu, J. Park, R. Yu, Q. Li, B. Cung, C. Fischer, F. Rodriguez, M. Warschauer, P.
`Smyth, ‘The benefits and caveats of clickstream data to understand student self-regulatory behaviors:
`opening the black box of learning processes,’ International Journal of Educational Technology in Higher
`Education, 17(13):1–24, 2020.
`
`J74 D. Ji, P. Putzel, Y. Qian, I. Chang, A. Mandava, R. H. Scheuermann, J. D. Bui, H-Y Wang, P. Smyth,
`‘Machine learning of discriminative gate locations for clinical diagnosis,’ Cytometry A: Special Issue:
`Machine Learning for Single Cell Data, 97(3):296–307, 2020.
`
`J73 C. Fischer, Z. Pardos, R. Baker, J. J. Williams, P. Smyth, R. Yu, S. Slater, R. Baker, M. Warschauer,
`‘Mining big data in education: Affordances and challenges,’ Review of Research in Education,
`44(1):130-160, 2020.
`
`J72 S. Coffield, C. Graff, Y. Chen, P. Smyth, E. Foufoula-Georgiou, J. Randerson, ‘Machine learning to
`predict final fire size at the time of ignition,’ International Journal of Wildland Fire, 28(11):861–873,
`2019.
`
`J71 J. Park, D. Kotzias, P. Kuo, R. L. Logan, K. Merced, S. Singh, M. Tanana, E. Karra-Taniskidou, J.
`Elston Lafata, D. C. Atkins, M. Tai-Seale, Z. E. Imel, and P. Smyth, ‘Detecting conversation topics
`in primary care office visits from transcripts of patient-provider interactions,’ Journal of the American
`Medical Informatics Association (JAMIA), 26(12):1493–1504, 2019.
`
`J70 D. Kotzias, M. Lichman, and P. Smyth, ‘Predicting consumption patterns with repeated and novel
`events,’ IEEE Transactions on Knowledge and Data Engineering, 31(2), 371-384, 2018.
`
`J69 J. R. Hipp, C. Bates, M. Lichman, and P. Smyth, ‘Using social media to measure temporal ambient
`population: does it help explain local crime rates?’ Justice Quarterly, 36(4), 714-748, March 2018.
`
`J68 C. Galbraith and P. Smyth, ‘Analyzing user-event data using score-based likelihood ratios with marked
`point processes,’ Journal of Digital Investigation, 22, 106-114, 2017.
`
`J67 T. Holsclaw, A. M. Greene, A. W. Robertson, P. Smyth, ‘Bayesian non-homogeneous Markov
`models via Polya-Gamma data augmentation with applications to rainfall modeling,’ Annals of Applied
`Statistics, 11(1):393–426, 2017.
`
`J66 G. Gaut, M. Steyvers, Z. E. Imel, D. C. Atkins, P. Smyth, ‘Content coding of psychotherapy transcripts
`using labeled topic models,’ IEEE Journal of Biomedical and Health Informatics, 21(2):476–487, 2017.
`
`J65 C. Haffke, G. Magnusdottir, D. Henke, P. Smyth, Y. Peings, ‘Daily states of the March-April east
`Pacific ITCZ in three decades of high-resolution satellite data,’ Journal of Climate, doi:10.1175/JCLI-
`D-15-0224.1, 29(8):2981-2995, 2016.
`
`J64 P. Arnesen, T. Holsclaw, P. Smyth, ‘Bayesian detection of changepoints in finite-state Markov chains
`for multiple sequences,’ Technometrics, doi:10.1080/00401706.2015.1044118, 58(2), 205-213, 2016.
`
`J63 T. Holsclaw, A. Greene, A. W. Robertson, P. Smyth, ‘A Bayesian hidden Markov model of daily
`precipitation over South and East Asia,’ Journal of Hydrometeorology, doi:10.1175/JHM-D-14-0142.1,
`17(1):3–25, 2016.
`
`J62 T. Holsclaw, K. A. Hallgren, M. Steyvers, P. Smyth, D. C. Atkins, ‘Measurement error and outcome
`distributions: Methodological issues in regression analyses of behavioral coding data,’ Psychology of
`Addictive Behaviors, doi:10.1037/adb0000091, 29(4):1031-1040, 2015.
`
`J61 M. L. Salmans, Z. Yu, K. Watanabe, E. Cam, P. Sun, P. Smyth, X. Dai, B. Andersen, ‘The co-factor of
`LIM domains (CLIM/LDB/NLI) maintains basal mammary epithelial stem cells and promotes breast
`tumorigenesis,’ PLOS Genetics, July 2014, doi: 10.1371/journal.pgen.100452.
`
`J60 A. J. Frank, P. Smyth, A. T. Ihler, ‘Beyond MAP estimation with the track-oriented multiple hypothesis
`tracker,’ IEEE Transactions on Signal Processing, 62(9):2413–2423, 2014.
`
`J59 D. C. Atkins, M. Steyvers, Z. E. Imel, P. Smyth, ‘Scaling up the evaluation of psychotherapy: evaluating
`motivational interviewing fidelity via statistical text classification,’ Implementation Science, 9:49:1–11,
`2014.
`
`J58 C. DuBois, C. T. Butts, D. McFarland, P. Smyth, ‘Hierarchical models for re