The Big Data Blog

A science blog about my spare time data analysis projects.

Category: All Articles

All Articles / Coding / Introduction / Python / Python

Efficient Kernel Smoother in the ONNX Format

Efficient data processing lies at the heart of modern machine learning. Kernel smoothing, a versatile non-parametric technique, benefits significantly from performance improvements through optimized algorithms and frameworks. In this post, …

All Articles / Fundamentals / Introduction

Offset and Weights in GLM Regression

Introduction: In GLMs, we often encounter scenarios where we need to account for exposure or adjust for certain factors. Both offset and weights play crucial roles in achieving this. Let’s …

All Articles / Coding / Finance / Large Language Models / Python

Unlocking Stock Market Insights with RiskBERT

Predicting stock prices is akin to the attempts of alchemists in the Middle Ages to transmute lead into gold. Just as alchemists sought to unlock the secrets of transformation, statisticians …

All Articles / Introduction

Notes on stochastic count processes with independent non stationary increments

Problem Statement: Let […]; thus we construct a stochastic process as […]. Random variables of this kind are called hypoexponential random variables. We define a counting process such that […]. …

All Articles / Coding / Large Language Models / Python

Generalized Semantic Regression using Contextual Embeddings

In many applications, actuaries, data scientists or researchers are confronted with datasets like the one shown in Table 1. While nominal variables, as in the first or second column, can be used …

All Articles / Finance / Introduction

Expected maxima of a Brownian Motion- Does my stock trading strategy work?

I recently realized that I have a certain habit when it comes to stock trading even though I used to tell myself that “every trading strategy is useless because the …

All Articles

Estimating the extrema of noisy curves and optimization using spline surface approximation

Estimating the maxima and minima of a noisy curve turns out to be very hard and is, to a large extent, still an open question. In this blog post we will discover some strategies …

All Articles / Introduction

A WordPress Plugin to embed raw.githubusercontent

While working on my post about the Dask cluster, I realized that the code snippets presented in the post will eventually change in my GitHub repo. I wanted to avoid having …

All Articles / Cluster / Kubernetes / Python

Building a minimal, cost efficient Dask cluster

In this article we will show a way to do high-performance parallel computing on a Kubernetes cluster using Dask. A primary focus is that we want to achieve the …

All Articles / Coding / Fundamentals / Introduction

Frequency-Severity Modeling in consideration of COVID-19 induced effects

This post is supposed to give a brief introduction to Frequency-Severity models. These models are very popular for determining the optimal price of insurance. We will take a look …

All Articles / Coding / Python

Kernel Regression using the Fast Fourier Transform

1. Setup: In a previous post it was shown how to speed up the computation of a kernel density using the Fast Fourier Transform. Conceptually a kernel density is not …

All Articles / Hardware

Cluster Monitoring using a ST7789 Display

In the articles Kubernetes at an OrangePi and Setting up the OrangePi it was described how I built my toy cluster. Meanwhile the cluster received some updates. Two more nodes …

All Articles / Coding / Python

Fast Kernel Density Estimation using the Fast Fourier Transform

1. Setup: This post is about how to speed up the computation of kernel density estimators using the FFT (Fast Fourier Transform). Let […] be a random sample drawn from an …

All Articles / Coding / Fundamentals / Python / Spark

Non-Linear Classification Methods in Spark

In a previous post I covered how to apply classical linear estimators like support vector machines or logistic regression to a non-linear dataset using the kernel method. This article can …

All Articles / Fundamentals / Hardware

Kubernetes at an OrangePi

1. Install k3s on the OrangePi: In a previous article I explained how to get Spark running on an OrangePi to create a toy computing cluster. If you look at this …

All Articles / Coding / Introduction / JavaScript

An AI with less than 200 lines of code

In the last two articles we covered the topics “How to teach a computer gamerules” and “The Multiarmed Bandit Problem”. Indeed, these two articles were intended to be an introduction …

All Articles / Coding / JavaScript

Teaching a Computer Gamerules

Sequential games with perfect information. 1. A very short course in Game Theory: In the twenties, people started to describe games using math. Since then, game theory has become an important …

All Articles / Coding / Fundamentals / Introduction / JavaScript

Solving the Multiarmed Bandit problem with JavaScript

1. Formulation of the Multiarmed Bandit Problem: Consider the following problem: A gambler enters a casino with slot machines. The probability of receiving a reward for each slot machine follows …

All Articles / Fundamentals / Introduction

An introduction to the Registration Problem

To explain the registration problem I will start with an example. In Figure 1 the pinch force dataset is shown. To collect the data, a group of 20 subjects were …

All Articles / Coding / Fundamentals / Python / Spark

Non-Linear Support Vector Machines (SVM)

1. Introduction: This blog post is about Support Vector Machines (SVM), but not only about SVMs. SVMs belong to the class of classification algorithms and are used to separate one …

Copyright © 2025 The Big Data Blog – OnePress theme by FameThemes