Machine Learning for Audio, Image and Video Analysis

by Francesco Camastra and Alessandro Vinciarelli

Theory and Applications


Description

Machine Learning involves several scientific domains including mathematics, computer science, statistics and biology, and is an approach that enables computers to automatically learn from data. Focusing on complex media and how to convert raw data into useful information, this book offers both introductory and advanced material in the combined fields of machine learning and image/video processing.

The machine learning techniques presented enable readers to address many real-world problems involving complex data. Examples covering areas such as automatic speech and handwriting transcription, automatic face recognition, and semantic video segmentation are included, along with detailed introductions to algorithms and examples of their applications.

The book is organized in four parts. The first focuses on technical aspects, basic mathematical notions and elementary machine learning techniques. The second provides an extensive survey of the most relevant machine learning techniques for media processing, while the third focuses on applications and shows how the techniques are applied to actual problems. The fourth part contains detailed appendices covering the main mathematical instruments used throughout the text.

Students and researchers needing a solid foundation or reference, and practitioners interested in discovering more about the state-of-the-art will find this book invaluable. Examples and problems are based on data and software packages publicly available on the web.

This book illustrates how to deal with complex media and convert raw data into useful information.

1.1 Two Fundamental Questions

There are two fundamental questions that should be answered before buying, and even more before reading, a book:

• Why should one read the book?
• What is the book about?

This is the reason why this section, the first of the whole text, proposes some motivations for potential readers (Section 1.1.1) and an overall description of the content (Section 1.1.2). If the answers are convincing, further information can be found in the rest of this chapter: Section 1.2 shows in detail the structure of the book, Section 1.3 presents some features that can help the reader to better move through the text, and Section 1.4 provides some reading tracks targeting specific topics.

1.1.1 Why Should One Read The Book?

One of the most interesting technological phenomena in recent years is the diffusion of consumer electronic products with constantly increasing acquisition, storage and processing power. As an example, consider the evolution of digital cameras: the first models available in the market in the early nineties produced images composed of 1.6 million pixels (this is the meaning of the expression 1.6 megapixels), carried an onboard memory of 16 megabytes, and had an average cost higher than 10,000 U.S. dollars. At the time this book is being written, the best models are close to or even above 8 megapixels, have internal memories of one gigabyte and they cost around 1,000 U.S. dollars.

Provides detailed introductions to algorithms and examples of their applications

Domains that appear far from one another such as speech and handwriting recognition are shown to be equivalent from the processing point of view, via the unifying framework of machine learning

Supplies detailed appendices reviewing the basic background

Provides pointers to publicly available data and software packages used in examples and problems

Machine learning enables computers to automatically "learn" from data. It involves several disciplines, including mathematics, computer science, statistics, and biology. Focusing on complex media and the conversion of raw data into useful information, this book offers both introductory and advanced material in the fields of machine learning and image/video processing. The book is organized in four parts. The first part focuses on technical aspects, basic mathematical notions, and elementary machine learning techniques. The second provides an extensive survey of the relevant machine learning techniques for media processing. The third part focuses on applications, and the fourth contains appendices that provide detailed information on the mathematical instruments used in the book. Examples and problems throughout the book, based on data and software packages publicly available on the web, help readers apply machine learning to address real-world issues involving complex data.

Author

Francesco Camastra

Topics in "Machine Learning for Audio, Image and Video Analysis"

Classification, Clustering, Ensemble methods, Face verification, HSV, Hidden Markov methods, Kernel methods, MP3, MPEG, Speech and handwriting recognition, cognition, kernel method, learning, machine learning, verification

Reviews of "Machine Learning for Audio, Image and Video Analysis"

From the reviews:

"A book that focuses on the intersection of these two fast-growing areas could not be better timed. … the book is organized into three major parts that cover audio and video processing, machine learning, and applications. … On the whole, this is a valuable and timely reference book for those interested in machine learning or audio, video, and image processing, although the need for a well-integrated book on this topic still remains." (M. Sasikumar, ACM Computing Reviews, December, 2008)

"…this book, unlike most other books in this field, not only introduces a few widely used techniques in audio and image analysis, but also discusses the latest advancements in the field. …Distinct from other books, it also points out several public software packages and benchmark data sets that encourage the reader to have a hands-on experience on how machine-learning techniques work to analyze audio and visual content. Its comprehensive coverage on recent development in this research area makes it easy for experienced researchers to further explore the latest techniques. …it is ideal as a textbook or supplemental material for senior graduate courses or advanced topic seminars." (Jie Yu, Journal of Electronic Imaging, Vol. 18, Apr–Jun 2009)


Details

ISBN: 9781848000063

Publisher: Springer London

Published: 3 December 2007