The Big Data Blog

A science blog about my spare time data analysis projects.

Category: Projects

Coding / Introduction / Passwords / Python

Guessing Passwords using an LSTM Network

Introduction: A while ago I applied NLP strategies to implement an algorithm that is capable of guessing a password. Since then, a new method called the transformer was developed and successfully …

All Articles / Coding / Functional Data Analysis with Spark / Projects / Python / Spark

Kernel Regression using Pyspark

1. Kernel Regression using Pyspark: In a previous article I presented an implementation of kernel density estimation using PySpark. It is thus not difficult to modify the algorithm to …

All Articles / Coding / Functional Data Analysis with Spark / Python / Spark

Nonparametric Density estimation using Spark

1. A Nonparametric Density implementation in Spark: One of my previous blog posts is about nonparametric density estimation. In that post I presented some Matlab code. An advantage of this …

All Articles / Coding / Functional Data Analysis with Spark / Python / Spark

Functional Regression with Spark

1. Functional Regression Let the covariate be an at least twice continuously differentiable random function defined wlog. on an interval and the corresponding the response. For simplicity we assume centered …

All Articles / Coding / Functional Data Analysis with Spark / Python / Spark

Functional Principal Component Analysis with Spark

1.) Functional Principal Component Analysis Let be a centered smooth random function in , with finite second moment . Without loss of generality we assume instead of some arbitrary compact …

All Articles / Passwords / Projects

3. A more sophisticated approach using Markov chains.

1. Generalize the Procedure: The Educated Guess Procedure is not only very simple but also very unrealistic. At most the assumption that are independent and identically …

All Articles / Coding / Passwords / Projects / Python

2. Coding the “Educated Guess Procedure”

1. Perform the Analysis: To start with, we load the “rockyou.txt.tar.gz” password list using wget. I’m not sure if it is legal to provide a link to the list, therefore …

All Articles / Coding / Install Spark on a OrangePi PC / Projects / Python

5. Running some tests

1. Test the Environment 1.1 Simulation of a Brownian Motion: The purpose of the first notebook entry is to check whether matplotlib is correctly installed. We simulate 20 Brownian motions …

All Articles / Install Spark on a OrangePi PC / Projects

4. Install IPython Notebook for Remote Access and Hive

1. Requirements 2. Install Software: In this section we will install some software that will make life easier. In contrast to Spark or Hadoop, it is only required to install …

All Articles / Install Spark on a OrangePi PC / Projects

3. Build the Cluster

1. Requirements: We need an SD card with Lubuntu, Hadoop and Spark installed. 2. Build the Cluster 2.1 Clone the SD Card: sudo shutdown 0 of your orangepi and remove …

All Articles / Install Spark on a OrangePi PC / Projects

2. Install Hadoop and Spark

1. Requirements: An OrangePi with Lubuntu running, see this post for further instructions. 2. Install the Components 2.1 Update Java: In fact, Hadoop is not necessary for Spark. However, we will …

All Articles / Hardware / Install Spark on a OrangePi PC / Projects

1. Setting up the OrangePi

1. Requirements: OrangePi PC; SD card (larger than 8GB); 4.0*1.7 power supply or USB cable and charger (min. 2A); WLAN adapter or Ethernet cable; HDMI cable and monitor (only for setup) …

All Articles / Passwords / Projects

1. Thoughts about Passwords

1. Introduction: This project is about making an educated guess to derive an unknown password based on hacked password lists (Google: “RockYou”). In this section we will introduce the mathematical …

Copyright © 2025 The Big Data Blog – OnePress theme by FameThemes