OPAC

The data science handbook / Field Cady

Material type	E-book
Publisher	(Hoboken, NJ : John Wiley & Sons, Inc)
Publication year	2017
Size	1 online resource
Author heading	*Cady, Field 1984- author

URL	Volume	Location	Call number	Registration number	Comment	Printing year	Status	Usage notes	ISBN	Reservation	Request memo
URL		Imizu (electronic)	007	EB0005185	Wiley Online Library: Complete oBooks				9781119092933

General notes	Includes bibliographical references and index
	Print version record and CIP data provided by publisher
	Giving extensive coverage to computer science and software engineering since they play such a central role in the daily work of a data scientist, this comprehensive book provides a crash course in data science, combining all the necessary skills into a unified discipline. -- Edited summary from book
	Cover -- Title Page -- Copyright -- Dedication -- Contents -- Preface
	Chapter 1 Introduction: Becoming a Unicorn -- 1.1 Aren't Data Scientists Just Overpaid Statisticians? -- 1.2 How Is This Book Organized? -- 1.3 How to Use This Book? -- 1.4 Why Is It All in Python™, Anyway? -- 1.5 Example Code and Datasets -- 1.6 Parting Words
	Part I The Stuff You'll Always Use
	Chapter 2 The Data Science Road Map -- 2.1 Frame the Problem -- 2.2 Understand the Data: Basic Questions -- 2.3 Understand the Data: Data Wrangling -- 2.4 Understand the Data: Exploratory Analysis -- 2.5 Extract Features -- 2.6 Model -- 2.7 Present Results -- 2.8 Deploy Code -- 2.9 Iterating -- 2.10 Glossary
	Chapter 3 Programming Languages -- 3.1 Why Use a Programming Language? What Are the Other Options? -- 3.2 A Survey of Programming Languages for Data Science -- 3.3 Python Crash Course -- 3.4 Strings -- 3.5 Defining Functions -- 3.6 Python's Technical Libraries -- 3.7 Other Python Resources -- 3.8 Further Reading -- 3.9 Glossary
	Interlude: My Personal Toolkit
	Chapter 4 Data Munging: String Manipulation, Regular Expressions, and Data Cleaning -- 4.1 The Worst Dataset in the World -- 4.2 How to Identify Pathologies -- 4.3 Problems with Data Content -- 4.4 Formatting Issues -- 4.5 Example Formatting Script -- 4.6 Regular Expressions -- 4.7 Life in the Trenches -- 4.8 Glossary
	Chapter 5 Visualizations and Simple Metrics -- 5.1 A Note on Python's Visualization Tools -- 5.2 Example Code -- 5.3 Pie Charts -- 5.4 Bar Charts -- 5.5 Histograms -- 5.6 Means, Standard Deviations, Medians, and Quantiles -- 5.7 Boxplots -- 5.8 Scatterplots -- 5.9 Scatterplots with Logarithmic Axes -- 5.10 Scatter Matrices -- 5.11 Heatmaps -- 5.12 Correlations -- 5.13 Anscombe's Quartet and the Limits of Numbers -- 5.14 Time Series -- 5.15 Further Reading -- 5.16 Glossary
	Chapter 6 Machine Learning Overview -- 6.1 Historical Context -- 6.2 Supervised versus Unsupervised -- 6.3 Training Data, Testing Data, and the Great Boogeyman of Overfitting -- 6.4 Further Reading -- 6.5 Glossary
	Chapter 7 Interlude: Feature Extraction Ideas -- 7.1 Standard Features -- 7.2 Features That Involve Grouping -- 7.3 Preview of More Sophisticated Features -- 7.4 Defining the Feature You Want to Predict
	Chapter 8 Machine Learning Classification -- 8.1 What Is a Classifier, and What Can You Do with It? -- 8.2 A Few Practical Concerns -- 8.3 Binary versus Multiclass -- 8.4 Example Script -- 8.5 Specific Classifiers -- 8.6 Evaluating Classifiers -- 8.7 Selecting Classification Cutoffs -- 8.8 Further Reading -- 8.9 Glossary
	Chapter 9 Technical Communication and Documentation -- 9.1 Several Guiding Principles -- 9.2 Slide Decks -- 9.3 Written Reports -- 9.4 Speaking: What Has Worked for Me -- 9.5 Code Documentation -- 9.6 Further Reading -- 9.7 Glossary
	Part II Stuff You Still Need to Know
	Chapter 10 Unsupervised Learning: Clustering and Dimensionality Reduction -- 10.1 The Curse of Dimensionality -- 10.2 Example: Eigenfaces for Dimensionality Reduction -- 10.3 Principal Component Analysis and Factor Analysis -- 10.4 Skree Plots and Understanding Dimensionality -- 10.5 Factor Analysis -- 10.6 Limitations of PCA -- 10.7 Clustering -- 10.8 Further Reading -- 10.9 Glossary
	Chapter 11 Regression -- 11.1 Example: Predicting Diabetes Progression -- 11.2 Least Squares -- 11.3 Fitting Nonlinear Curves -- 11.4 Goodness of Fit: R² and Correlation -- 11.5 Correlation of Residuals -- 11.6 Linear Regression -- 11.7 LASSO Regression and Feature Selection -- 11.8 Further Reading -- 11.9 Glossary
	Chapter 12 Data Encodings and File Formats -- 12.1 Typical File Format Categories -- 12.2 CSV Files -- 12.3 JSON Files -- 12.4 XML Files
	17.5 Smoothing Signals -- 17.6 Logarithms and Other Transformations -- 17.7 Trends and Periodicity -- 17.8 Windowing -- 17.9 Brainstorming Simple Features -- 17.10 Better Features: Time Series as Vectors -- 17.11 Fourier Analysis: Sometimes a Magic Bullet -- 17.12 Time Series in Context: The Whole Suite of Features -- 17.13 Further Reading -- 17.14 Glossary
	Chapter 18 Probability -- 18.1 Flipping Coins: Bernoulli Random Variables -- 18.2 Throwing Darts: Uniform Random Variables -- 18.3 The Uniform Distribution and Pseudorandom Numbers -- 18.4 Nondiscrete, Noncontinuous Random Variables -- 18.5 Notation, Expectations, and Standard Deviation -- 18.6 Dependence, Marginal and Conditional Probability -- 18.7 Understanding the Tails -- 18.8 Binomial Distribution -- 18.9 Poisson Distribution -- 18.10 Normal Distribution -- 18.11 Multivariate Gaussian -- 18.12 Exponential Distribution -- 18.13 Log-Normal Distribution -- 18.14 Entropy -- 18.15 Further Reading -- 18.16 Glossary
	Chapter 19 Statistics -- 19.1 Statistics in Perspective -- 19.2 Bayesian versus Frequentist: Practical Tradeoffs and Differing Philosophies -- 19.3 Hypothesis Testing: Key Idea and Example -- 19.4 Multiple Hypothesis Testing -- 19.5 Parameter Estimation -- 19.6 Hypothesis Testing: t-Test -- 19.7 Confidence Intervals -- 19.8 Bayesian Statistics -- 19.9 Naive Bayesian Statistics -- 19.10 Bayesian Networks -- 19.11 Choosing Priors: Maximum Entropy or Domain Knowledge -- 19.12 Further Reading -- 19.13 Glossary
	Chapter 20 Programming Language Concepts -- 20.1 Programming Paradigms -- 20.2 Compilation and Interpretation -- 20.3 Type Systems -- 20.4 Further Reading -- 20.5 Glossary
	Chapter 21 Performance and Computer Memory -- 21.1 Example Script -- 21.2 Algorithm Performance and Big-O Notation -- 21.3 Some Classic Problems: Sorting a List and Binary Search -- 21.4 Amortized Performance and Average Performance -- 21.5 Two Principles: Reducing Overhead and Managing Memory -- 21.6 Performance Tip: Use Numerical Libraries When Applicable -- 21.7 Performance Tip: Delete Large Structures You Don't Need -- 21.8 Performance Tip: Use Built-In Functions When Possible -- 21.9 Performance Tip: Avoid Superfluous Function Calls -- 21.10 Performance Tip: Avoid Creating Large New Objects -- 21.11 Further Reading -- 21.12 Glossary
	Part III Specialized or Advanced Topics
	Chapter 22 Computer Memory and Data Structures -- 22.1 Virtual Memory, the Stack, and the Heap -- 22.2 Example C Program -- 22.3 Data Types and Arrays in Memory -- 22.4 Structs -- 22.5 Pointers, the Stack, and the Heap -- 22.6 Key Data Structures -- 22.7 Further Reading -- 22.8 Glossary
	Chapter 23 Maximum Likelihood Estimation and Optimization -- 23.1 Maximum Likelihood Estimation -- 23.2 A Simple Example: Fitting a Line -- 23.3 Another Example: Logistic Regression -- 23.4 Optimization -- 23.5 Gradient Descent and Convex Optimization -- 23.6 Convex Optimization -- 23.7 Stochastic Gradient Descent -- 23.8 Further Reading -- 23.9 Glossary
	Chapter 24 Advanced Classifiers -- 24.1 A Note on Libraries -- 24.2 Basic Deep Learning -- 24.3 Convolutional Neural Networks -- 24.4 Different Types of Layers. What the Heck Is a Tensor? -- 24.5 Example: The MNIST Handwriting Dataset -- 24.6 Recurrent Neural Networks -- 24.7 Bayesian Networks -- 24.8 Training and Prediction -- 24.9 Markov Chain Monte Carlo -- 24.10 PyMC Example -- 24.11 Further Reading -- 24.12 Glossary
	Chapter 25 Stochastic Modeling -- 25.1 Markov Chains -- 25.2 Two Kinds of Markov Chain, Two Kinds of Questions -- 25.3 Markov Chain Monte Carlo -- 25.4 Hidden Markov Models and the Viterbi Algorithm -- 25.5 The Viterbi Algorithm -- 25.6 Random Walks -- 25.7 Brownian Motion
	John Wiley and Sons
	Wiley Online Library: Complete oBooks
	HTTP:URL=https://onlinelibrary.wiley.com/doi/book/10.1002/9781119092919
Subjects	LCSH:Databases -- Handbooks, manuals, etc; LCSH:Statistics -- Data processing -- Handbooks, manuals, etc; LCSH:Big data -- Handbooks, manuals, etc; LCSH:Information theory -- Handbooks, manuals, etc; CSHF:Statistique -- Informatique -- Guides, manuels, etc; CSHF:Données volumineuses -- Guides, manuels, etc; CSHF:Théorie de l'information -- Guides, manuels, etc; FREE:COMPUTERS -- Databases -- General; FREE:Big data; FREE:Databases; FREE:Information theory; FREE:Statistics -- Data processing; MESH:Handbook; FREE:handbooks; FREE:Handbooks and manuals; FREE:Guides et manuels
Classification	LCC:QA76.9.D32 DC23:005.74
Bibliographic ID	EB00004475
ISBN	9781119092933