This thesis develops statistical modeling techniques for analyzing video content so that intelligent and efficient interaction with video becomes possible. We investigate several fundamental tasks in video content analysis. Specifically, we propose an on-line video parsing algorithm using basic statistical measures and an off-line solution using Independent Component Analysis (ICA). We develop a spatiotemporal video similarity model based on dynamic programming. For video object segmentation and tracking, we develop a new method based on probabilistic fuzzy c-means and Gibbs random fields. On the theoretical side, we develop a generic framework for sequential data analysis that integrates the Hidden Markov Model (HMM) with the ICA mixture model, and we derive the re-estimation formulas for learning its parameters. As a case study, the new model is applied to golf video for semantic event detection and recognition.