The University of Arizona



Series: Tripods
Location: ENR2 S230
Presenter: Dr. Marek Rychlik, UA Mathematics

Title: OCR, Soft K-Means, CLSTM and RAID

The purpose of this talk is to share with the TRIPODS groups at large
several problems and research directions that are in various phases of
progress. We hope to find other researchers involved in similar
research and to identify potential for collaboration.

OCR (optical character recognition): The current thrust of this
project is to develop algorithms for annotating a large collection of
scanned text in the Pashto language (spoken in Afghanistan and
Pakistan). This project is a collaboration with Yan Han at Library
Sciences.  The best algorithm is based on CLSTM (Context Long-Short
Term Memory), an algorithm descended from LSTM, which is used, among
other places, in Amazon's Alexa.

Soft K-Means and separation of mixtures: The problem of populations
which are "mixtures" of subpopulations goes back to the work of
Pearson.  Soft K-Means is an algorithm closely related to the EM
(Expectation-Maximization) algorithm. Soft K-Means is more suitable
for many problems than the well-known K-means algorithm.
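
As a rough illustration of the idea, here is a minimal sketch of one
common formulation of Soft K-Means on one-dimensional data. It is not
taken from the talk: the function name soft_kmeans_1d, the "stiffness"
parameter beta, and the sample data are all assumptions made for this
example. Each point is softly assigned to every center with a weight
proportional to exp(-beta * distance^2), and each center then moves to
the weighted mean of the points, which mirrors the E and M steps of EM.

```python
import math

def soft_kmeans_1d(points, centers, beta=1.0, iterations=50):
    """Soft K-Means on 1-D data (illustrative sketch, hypothetical names)."""
    centers = list(centers)
    resp = []
    for _ in range(iterations):
        # Soft assignment: responsibility of each center for each point.
        resp = []
        for x in points:
            exps = [-beta * (x - c) ** 2 for c in centers]
            m = max(exps)  # subtract the max exponent for numerical stability
            w = [math.exp(e - m) for e in exps]
            total = sum(w)
            resp.append([wi / total for wi in w])
        # Update: each center becomes the responsibility-weighted mean.
        for k in range(len(centers)):
            weight = sum(r[k] for r in resp)
            if weight > 0:
                centers[k] = sum(r[k] * x for r, x in zip(resp, points)) / weight
    return centers, resp

# Two well-separated groups of points (made-up data).
points = [0.0, 0.2, -0.1, 5.0, 5.1, 4.9]
centers, resp = soft_kmeans_1d(points, [1.0, 4.0], beta=2.0)
```

Unlike ordinary K-means, every point contributes to every center; as
beta grows, the assignments harden and the procedure approaches the
usual K-means update.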

RAID (Redundant Array of Independent Disks) is a method of combining
multiple disk drives into one device with better throughput and
error-correcting capabilities. This part of my talk is mostly on our
experience commercializing RAID invented by Mohamad Moussa and Marek
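
For readers unfamiliar with how redundancy lets an array recover from
a failure, the following is a minimal sketch of generic single-parity
protection using XOR, the basic idea behind classic parity RAID
levels. It is not the method discussed in the talk; the helper name
xor_blocks and the sample data are assumptions made for this example.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (hypothetical helper)."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

# Three data "disks" holding equal-length blocks (made-up contents).
data = [b"disk0 data", b"disk1 data", b"disk2 data"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Simulate losing disk 1: XOR of the survivors and the parity rebuilds it.
rebuilt = xor_blocks([data[0], data[2], parity])
```

A single parity block of this kind can rebuild exactly one lost block
whose position is known; it cannot by itself locate or correct silent
corruption.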

Department of Mathematics, The University of Arizona
617 N. Santa Rita Ave., P.O. Box 210089, Tucson, AZ 85721-0089 USA
Voice: (520) 621-6892  Fax: (520) 621-8322
© Copyright 2018 Arizona Board of Regents. All rights reserved.