There are no recommended prerequisites for this notebook, but you may find the first section of the Modelling notebook helpful for some of the exercises.

# Run this cell before beginning the exercises.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.optimize import minimize
import os
from typing import Tuple, Callable, Optional, List
from math import isclose
from io import StringIO
import time
import sys


The single lens model#

To determine the configuration of an event's lens, one must essentially consider models of all possible lens-source configurations and see which ones "match" the observed lightcurve. The collective wisdom of the microlensing community allows for some shortcuts in this process. For example, it is atypical to need to consider configurations beyond a single-lens system with a single source star (1L1S), a single-lens system with two contributing source stars (1L2S), or a binary-lens system with a single source star (2L1S). Considering a higher number of lens objects before thoroughly investigating single- and binary-lens options was a pitfall of microlensing in its infancy (KMT-2021-BLG-0322, Han et al., 2021; OGLE-2013-BLG-0723, Han et al., 2016 and Udalski et al., 2015; MACHO 97-BLG-41, Jung et al., 2013 and Albrow et al., 2000). Additionally, past experience has informed the notion that dramatic perturbations of the Paczynski curve*1 are caused by caustic crossings, which only exist in multiple-lens events.

*1 The Paczynski curve refers to the shape of a simple microlensing event in which both the lens and the source are single (not binaries) and effectively point-like.

This section discusses the first port of call for microlensing event modelling: the 1L1S model and its parameterisations.

In addition to this standard model, there are higher-order effects of situational importance; these are discussed in the Higher Order notebook.

The 1L1S model can be parameterised by $\Theta=(u_0, t_0, t_{\rm E})$, where $u_0$ is the minimum angular separation between the source star and lens mass in units of the angular Einstein radius ($\theta_{\rm E}$),*2,*3 $t_0$ is the time when the source is at position $u_0$, and $t_{\rm E}$ is the time taken for the source star to travel the angular distance $\theta_{\rm E}$. The parameter $t_{\rm E}$ is therefore used in describing the relative angular "velocity," where $$ \mu_{\rm rel}=\frac{\theta_{\rm E}}{t_{\rm E}}, $$ and $\mu_{\rm rel}$ is referred to as the relative proper motion. In this parameterisation, the source's relative trajectory is described by $$ u(t)=\left(\tau^2+u_0^2\right)^{1/2}, $$ where $\tau=\frac{t-t_0}{t_{\rm E}}$.*4 For simplicity's sake, microlensing models are usually described in the reference frame of the lens system. This means that, for modelling purposes, the relative velocities of the source and lens systems are attributed to the "source velocity." The total magnification, as a function of the relative position of the source, is described by $$ A(u)=\frac{u^2+2}{u\sqrt{u^2+4}}. $$

*2 The formulaic description of $\theta_{\rm E}$ can be found in the Introduction. We also spent time in that notebook building an intuition as to how different lens and source distances and lens masses affect $\theta_{\rm E}$.

*3 The derivation of $\theta_{\rm E}$ is shown here.

*4 The derivation of the magnification as a function of $u$ is shown here.
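To build a sense of scale for $A(u)$, we can evaluate the peak magnification ($u = u_0$ at $t = t_0$) for a few illustrative impact parameters:

# Peak magnification for a few (made-up) impact parameters
for u0 in (0.1, 0.3, 1.0):
    A_peak = (u0**2 + 2) / (u0 * np.sqrt(u0**2 + 4))
    print(f'u0 = {u0}: A_peak = {A_peak:.2f}')

This gives $A \approx 10.0$, $3.44$, and $1.34$, respectively: smaller impact parameters produce dramatically brighter peaks.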

When applying this magnification curve as a model for a lightcurve, the contribution of the source flux, $F_{\rm S}$, and blended flux, $F_{\rm B}$*5 must also be determined through linear regression, because the total observed flux for each point in a 1L1S lightcurve is described by $$ F_i=A_i\times F_{\rm S}+F_{\rm B}; $$ the lensing effect only magnifies flux from the source.

*5 Blended flux comes from other stars, in the same line of sight, that are attributed to the event's PSF during photometry. This blend flux may be entirely due to the lens, or not at all (in the case of dim lenses, like black holes).

Finding which $F_{\rm S}$ and $F_{\rm B}$ best generate the data ($\mathbf{F}$), for a given magnification model ($\mathbf{A}$), can be achieved using linear regression. Linear regression is a method used to calculate which model parameters minimise the summed, squared "distance" of the data from a linear model, scaled by the data uncertainties. This minimum occurs where the partial derivatives of the summed, squared, uncertainty-scaled distances equate to zero; i.e., $$ \frac{\partial\chi^2}{\partial F_{\rm S}} = 0, \qquad \frac{\partial\chi^2}{\partial F_{\rm B}} = 0. $$

Linear regression considers only the uncertainty-scaled $y$ distance (flux, in this example), and it is therefore only an appropriate choice of regression method if the $x$ positions are accurate and precise (i.e., negligible $x$ errors). Note that a linear model does not mean the model is a straight line: $ \mathbf{F}=\mathbf{A} F_{\rm S}+F_{\rm B} $ is linear in the parameters even when $\mathbf{A}$ is not linear in time.

$\chi^2$ is a metric used to describe how likely a parameter set is to have generated the observed data, for a given model. $\chi^2$ requires assumptions of Gaussian-distributed $y$ errors and negligible $x$ errors. It is defined, for this model, as follows: $$ \chi^2 = \sum\limits_i \frac{\left(F_i - x_i\right)^2}{\sigma_i^2}, $$ where $F_i = A_iF_{\rm S}+F_{\rm B}$ is the model flux, $x_i$ are the photometric lightcurve data, and ${\sigma_i^2}$ are the data variances. The maximum-likelihood solution $(F_{\rm S}, F_{\rm B})$ can therefore be found by solving the following simultaneous equations: $$ \sum\limits_i \frac{A_i(A_iF_{\rm S}+F_{\rm B}-x_i)}{\sigma_i^2} = 0, $$ $$ \sum\limits_i \frac{A_iF_{\rm S}+F_{\rm B}-x_i}{\sigma_i^2} = 0. $$

This problem can be expressed in matrix form as $$ \begin{bmatrix} \sum\limits_{i}\frac{A_{i}^{2}}{\sigma_{i}^{2}} & \sum\limits_{i}\frac{A_{i}}{\sigma_{i}^{2}} \\ \sum\limits_{i}\frac{A_{i}}{\sigma_{i}^{2}} & \sum\limits_{i}\frac{1}{\sigma_{i}^{2}} \end{bmatrix} \begin{bmatrix} F_{\rm S} \\ F_{\rm B} \end{bmatrix} = \begin{bmatrix} \sum\limits_{i}\frac{A_{i}x_{i}}{\sigma_{i}^{2}} \\ \sum\limits_{i}\frac{x_{i}}{\sigma_{i}^{2}} \end{bmatrix}, \qquad \mathbf{B}\,\mathbf{\Theta} = \mathbf{C}. $$

This is solved as follows: $$ \mathbf{\Theta} = \mathbf{B}^{-1} \mathbf{C} = \frac{\operatorname{adj}\mathbf{B}}{\det\mathbf{B}} \, \mathbf{C}. $$
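For a $2\times2$ matrix, the adjugate (and hence the inverse) can be written down directly, which makes the closed-form solution explicit: $$ \mathbf{B}^{-1} = \frac{1}{\det\mathbf{B}} \begin{bmatrix} b_{22} & -b_{12} \\ -b_{21} & b_{11} \end{bmatrix}, \qquad \det\mathbf{B} = b_{11}b_{22} - b_{12}b_{21}, $$ where $b_{jk}$ are the elements of $\mathbf{B}$. In practice, we let `np.linalg.solve` do this for us (see the `calc_Fs` function below).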

Exercise 1

Write your own point-source 1L1S magnification model function.


def SL_magnification(u0:float, t0:float, tE:float, t:np.ndarray) -> np.ndarray:
    '''
    Point-source 1L1S magnification curve.

    Parameters
    ----------
    u0 : float
        Minimum impact parameter, in units of the angular Einstein radius.
    t0 : float
        Time of closest approach (the peak of the event).
    tE : float
        Einstein radius crossing time, in days.
    t : np.ndarray
        Epochs at which to evaluate the magnification.

    Returns
    -------
    A : np.ndarray
        Magnification at each epoch.
    '''
    ######################
    # EXERCISE: SingleLensE1.txt
    #--------------------- 
    tau = (t-t0)/tE
    u = (tau**2 + u0**2)**(1/2)
    A = (u**2 + 2) / (u*(u**2+4)**(1/2))
    ######################

    
    return A
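Before moving on, a quick sanity check with illustrative parameters: the magnification should peak at $t_0$ and fall to the baseline ($A \to 1$) far from the event.

# Sanity check with made-up parameters: u0=0.3, t0=7900, tE=20
t_test = np.array([7800.0, 7900.0, 8900.0])
A_test = SL_magnification(0.3, 7900.0, 20.0, t_test)
print(A_test)  # expect ~[1.00, 3.44, 1.00]: peak at t0, baseline far away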

It is good practice to follow this style guide when coding in Python. This tutorial goes through it all efficiently. Astronomy modules written in Python typically follow the Numpydoc style for docstrings.

Let's generate a fake data set using your function.

def generate_lightcurve(
                        SL_magnification: Callable[[float, float, float, np.ndarray], np.ndarray], 
                        u0: float, 
                        t0: float, 
                        tE: float, 
                        FS: float, 
                        FB: float, 
                        sig: np.ndarray, 
                        t: Optional[np.ndarray] = None, 
                        trange: Optional[Tuple[float, float]] = None
                        ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Generate lightcurve for given parameters and epochs or date range.
    
    Parameters
    ----------
    SL_magnification : Callable[[float, float, float, np.ndarray], np.ndarray]
        Single lens magnification function.
    u0 : float
        Impact parameter.
    t0 : float
        Time of closest approach.
    tE : float
        Einstein radius crossing time.
    FS : float
        Source flux.
    FB : float
        Blend flux.
    sig : np.ndarray
        Standard deviation of Gaussian errors.
    t : Optional[np.ndarray], optional
        Array of epochs (default is None).
    trange : Optional[Tuple[float, float]], optional
        Date range (default is None).
        - tstart : float, start of date range.
        - tfinish : float, end of date range.
    
    Returns
    -------
    t_data : np.ndarray
        Generated time points.
    flux_data : np.ndarray
        Generated flux data with noise.
    flux_err_data : np.ndarray
        Gaussian errors.

    Raises
    ------
    ValueError
        If neither epochs nor date range is provided.
    """
    
    # Generate epochs if not provided
    if t is None:
        if trange is None:
            raise ValueError('Either epochs or date range must be provided.')
        tstart, tfinish = trange
        days = np.arange(np.floor(tstart), np.ceil(tfinish)) # Generate integer days within the range
        t = []
        for day in days:
            if tstart <= day <= tfinish:
                # Generate epochs for the night
                num_epochs = np.random.randint(0, 9)  # Random number of epochs between 0 and 8
                epochs = np.random.choice(np.arange(0.3, 0.6, 0.0104167), size=num_epochs, replace=False)  # 15-min slots (0.0104167 d) in a nightly window
                t.extend(day + epochs)
    t = np.array(t)
    
    # Generate theoretical flux
    A = SL_magnification(u0, t0, tE, t)
    flux_theoretical = A*FS + FB
    
    # Generate Gaussian noise
    noise = np.random.normal(0, sig, size=t.shape)
    
    # Generate noisy flux data
    flux_data = flux_theoretical + noise
    
    # Gaussian errors
    flux_err_data = np.full(t.shape, sig)
    
    return t, flux_data, flux_err_data
    

# making up event parameters and data stats
sig = 11.2
u0 = 0.4321
t0 = 7892.123
tE = 23.4
FS = 343.21
FB = 231.2

trange = (t0-82, t0+70)

# generating fake data
t_data, flux_data, flux_err_data = generate_lightcurve(SL_magnification, 
                                                       u0, t0, tE, FS, FB, 
                                                       sig, trange=trange)

%matplotlib widget

# Let's start by taking a look at the data
plt.close(1)
plt.figure(figsize=(9,6), num=1)
plt.errorbar(t_data, flux_data, yerr=flux_err_data, fmt='x', color='black', label='data')
plt.xlabel('HJD - 2450000')
plt.ylabel('Flux')
plt.title('Single Lens Generated Lightcurve')
plt.xlim(trange[0], trange[1])
plt.legend()
plt.show()

Below is a function to solve for $F_{\rm S}$ and $F_{\rm B}$.

def calc_Fs(model: np.ndarray, x: np.ndarray, sig2: np.ndarray) -> Tuple[float, float]:
    '''
    Solves for the flux parameters for a given model using least squares.
    
    Parameters
    ----------
    model : np.ndarray
        Model magnification curve.
    x : np.ndarray
        Observed flux values.
    sig2 : np.ndarray
        Squared flux errors (variances).
    
    Returns
    -------
    FS : float
        Source flux.
    FB : float
        Blend flux.
    '''
    # Normal-equations matrix, B
    B11 = np.sum(model**2 / sig2)
    Boff = np.sum(model / sig2)
    B22 = np.sum(1.0 / sig2)
    B = np.array([[B11, Boff], [Boff, B22]])

    # Right-hand-side vector, C
    C1 = np.sum((x * model) / sig2)
    C2 = np.sum(x / sig2)
    C = np.array([C1, C2])

    # Solve B.Theta = C for Theta = (FS, FB)
    Theta = np.linalg.solve(B, C)
    FS = float(Theta[0])
    FB = float(Theta[1])
    
    return FS, FB
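Because the flux model is linear in $(F_{\rm S}, F_{\rm B})$, we can cross-check calc_Fs() against NumPy's weighted straight-line fit; per the NumPy documentation, np.polyfit with weights $1/\sigma$ minimises the same $\chi^2$. A quick sketch using the fake data from above:

# Cross-check: a weighted linear fit of flux against magnification
# returns (slope, intercept) = (FS, FB)
A_true = SL_magnification(u0, t0, tE, t_data)
print(np.polyfit(A_true, flux_data, 1, w=1.0/flux_err_data))
print(calc_Fs(A_true, flux_data, flux_err_data**2))

The two should agree to numerical precision.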

Okay, now let's try fitting our lightcurve.

Exercise 2

Make an initial guess at your microlensing parameters (p0). Don't stress about this too much. It's a guess; it doesn't have to be that good.

  • t0 is the time of the peak of the event,
  • u0 is probably somewhere between 0 and 1, if there is an observable Paczynski curve,
  • tE is the width of the Paczynski curve and is typically 10–30 days (a rough automated guess is also sketched below).
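If you would rather not eyeball the plot, a rough guess can be automated (a heuristic sketch, not part of the exercise; the thresholds are arbitrary):

# Heuristic first guess: t0 from the brightest point, tE from the
# half-width of the points above half of the peak excess
i_peak = np.argmax(flux_data)
t0_rough = t_data[i_peak]
half_level = (flux_data[i_peak] + np.median(flux_data)) / 2
t_above = t_data[flux_data > half_level]
tE_rough = (t_above.max() - t_above.min()) / 2
print(t0_rough, tE_rough)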

######################
# EXERCISE: SingleLensE2.txt
#---------------------
# Initial guess
u0 = 0.7
t0 = 7889
tE = 30
######################

# magnification model for initial guess chi2
A_0 = SL_magnification(u0, t0, tE, t_data)

# Model
t_model = np.linspace(7750, 8000, 1000)
A_model = SL_magnification(u0, t0, tE, t_model)

p0 = [u0, t0, tE]
FS0, FB0 = calc_Fs(A_0, flux_data, flux_err_data**2)

# Plot the data and guess
plt.close(2)
plt.figure(figsize=(9,6), num=2)
plt.errorbar(t_data, flux_data, yerr=flux_err_data, fmt='x', color='black', label='data')
plt.plot(t_model, FS0*A_model + FB0, color='red', label='Initial Guess')
plt.xlabel('HJD - 2450000')
plt.ylabel('Flux')
plt.title('Single Lens Generated Lightcurve')
plt.xlim(7750, 8000)
plt.legend()
plt.show()

Exercise 3

Write an objective function for the fit.


def chi2(
         p: List[float], 
         t: np.ndarray, 
         f: np.ndarray, 
         sig: np.ndarray,
         SL_magnification: Callable[[float, float, float, np.ndarray], np.ndarray]
         ) -> float:
    '''
    Calculates the chi squared value for a given model and parameters.
    
    Parameters
    ----------
    p : List[float]
        Model parameters (u0, t0, tE).
    t : np.ndarray
        Data epochs.
    f : np.ndarray
        Observed flux values.
    sig : np.ndarray
        Flux errors.
    SL_magnification : Callable[[float, float, float, np.ndarray], np.ndarray]
        Single-lens magnification function.
    
    Returns
    -------
    chi2 : float
        Chi squared value.
    '''

    ######################
    # EXERCISE: SingleLensE3.txt
    #--------------------- 
    sig2 = sig**2
    model = SL_magnification(p[0], p[1], p[2], t)
    FS, FB = calc_Fs(model, f, sig2)
    chi2_value = np.sum((FS*model + FB - f)**2 / sig2)  # sum of (residual/sigma)^2
    ######################

    return chi2_value

# Minimize the chi squared value
result = minimize(
                  chi2, 
                  p0, 
                  args=(t_data, flux_data, flux_err_data, SL_magnification), 
                  method='Nelder-Mead'
                  )
popt = result.x
print(result)

# Model
A_model_opt = SL_magnification(
                               popt[0], 
                               popt[1], 
                               popt[2], 
                               t_model
                               )  # model line for best fit parameters
A_opt = SL_magnification(
                         popt[0], 
                         popt[1], 
                         popt[2], 
                         t_data
                         )  # model flux values for best fit parameters
FS_opt, FB_opt = calc_Fs(
                         A_opt, 
                         flux_data, 
                         flux_err_data**2
                         )  # best fit flux parameters

# Plot
plt.close(3)
plt.figure(figsize=(9,6), num=3)

plt.errorbar(
             t_data, 
             flux_data, 
             yerr=flux_err_data, 
             fmt='x', 
             color='black', 
             label='data'
             )
plt.plot(
         t_model, 
         FS0*A_model + FB0, 
         color='red', 
         label='Initial Guess', 
         alpha=0.3
         )
plt.plot(
         t_model, 
         FS_opt*A_model_opt + FB_opt, 
         color='red', 
         label='Best Fit'
         )

plt.xlabel('HJD - 2450000')
plt.ylabel('Flux')
plt.title('Single Lens Generated Lightcurve')
plt.xlim(7750, 8000)
plt.legend()
plt.show()
       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 4.779488004039568
             x: [ 4.423e-01  7.892e+03  2.300e+01]
           nit: 127
          nfev: 234
 final_simplex: (array([[ 4.423e-01,  7.892e+03,  2.300e+01],
                       [ 4.423e-01,  7.892e+03,  2.300e+01],
                       [ 4.423e-01,  7.892e+03,  2.300e+01],
                       [ 4.423e-01,  7.892e+03,  2.300e+01]]), array([ 4.779e+00,  4.779e+00,  4.779e+00,  4.779e+00]))

Great job!

Did you get fit parameters similar to those you put in to the generate_lightcurve() function?

If not, maybe you should take another look at your $\chi^2$ function.

If your fit looks good, let's try to quantify how much better it is than the initial guess. Given the assumptions we mentioned earlier about Gaussianity and one-dimensional data uncertainties, the relative probability that our model produced the data is

$$ \mathcal{L} = e^{-\Delta\chi^2/2}. $$

Exercise 4

How much more probable is it that your fitted parameters produced the lightcurve than your initial guess parameters?


######################
# EXERCISE: SingleLensE4.txt
#---------------------
chi2_0 = chi2(p0, t_data, flux_data, flux_err_data, SL_magnification)
chi2_opt = chi2(popt, t_data, flux_data, flux_err_data, SL_magnification)

L0 = np.exp(-0.5*chi2_0)
Lopt = np.exp(-0.5*chi2_opt)

print(f'Likelihood ratio: {Lopt/L0:.4e}')
######################
Likelihood ratio: 6.7429e+17
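A practical aside: for real data sets, $\chi^2$ values are typically in the hundreds or thousands, and np.exp(-0.5*chi2) underflows to zero, making the ratio nan. It is numerically safer to exponentiate the difference instead:

# Equivalent but stable: L_opt/L_0 = exp((chi2_0 - chi2_opt)/2)
delta_chi2 = chi2_0 - chi2_opt
print(f'Likelihood ratio: {np.exp(0.5*delta_chi2):.4e}')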

Let's try that again with some real data.

# Magnitude to flux and flux to magnitude functions

def mag2flux(
             mag: np.ndarray, 
             zp: float = 25.0
            ) -> np.ndarray:
    '''
    Converts magnitude values (array) to flux.
    
    Parameters
    ----------
    mag : np.ndarray
        Array of magnitudes.
    zp : float, optional
        Zero point of the magnitude system (default is 25.0).

    Returns
    -------
    f : np.ndarray
        Array of fluxes.
    '''
    f = 10.0**((mag - zp) / -2.5)
    return f

def flux2mag(
             f: np.ndarray,
             zp: float = 25.0,
            ) -> np.ndarray:
    '''
    Converts flux values (array) to magnitude.

    Parameters
    ----------
    f : np.ndarray
        Array of fluxes.
    zp : float, optional
        Zero point of the magnitude system (default is 25.0).

    Returns
    -------
    mag : np.ndarray
        Array of magnitudes.
    '''

    mag = -2.5 * np.log10(f) + zp
    return mag

def mag2flux_err(
                 mag: np.ndarray, 
                 mag_err: np.ndarray, 
                 zp: float = 25.0
                 ) -> np.ndarray:
    '''
    Converts magnitude values and errors (array) to flux using error propagation formula.
    
    Parameters
    ----------
    mag : np.ndarray
        Array of magnitudes.
    mag_err : np.ndarray
        Array of magnitude errors.
    zp : float, optional
        Zero point of the magnitude system (default is 25.0).

    Returns
    -------
    flux_err : np.ndarray
        Array of flux errors.

    Notes
    -----
    For a function $y = f(x)$, the error in $y$ is given by
    $$ dy = \sqrt{\left(\frac{dy}{dx}\right)^2 dx^2}. $$
    '''
    dfdmag = -0.4 * np.log(10) * 10.**(0.4*(zp-mag))
    flux_err = np.sqrt(dfdmag**2 * mag_err**2)
    return flux_err

def flux2mag_err(
                 f: np.ndarray, 
                 f_err: np.ndarray, 
                ) -> np.ndarray:
    '''
    Converts flux values and errors (array) to magnitude using error propagation formula.
    
    Parameters
    ----------
    f : np.ndarray
        Array of fluxes.
    f_err : np.ndarray
        Array of flux errors.

    Returns
    -------
    mag_err : np.ndarray
        Array of magnitude errors.

    Notes
    -----
    For a function $y = f(x)$, the error in $y$ is given by
    $$ dy = \sqrt{\left(\frac{dy}{dx}\right)^2 dx^2}. $$
    '''
    dmagdf = - 2.5 / (f * np.log(10))
    mag_err = np.sqrt(dmagdf**2 * f_err**2)
    return mag_err
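A quick round-trip check of the conversion pair (illustrative magnitudes; both directions use the same default zero point):

# mag -> flux -> mag should return the input values
test_mags = np.array([16.0, 18.5, 21.0])
assert np.allclose(flux2mag(mag2flux(test_mags)), test_mags)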
# Data
data_dir = os.path.abspath('./Data/Events/OB170002/')

def read_data(data_dir: str, 
              extension: List[str] = ['dat','diapl','pysis','txt']
              ) -> dict:
    '''
    Reads in the data from a directory.
    
    Parameters
    ----------
    data_dir : str
        Directory path.
    extension : List[str], optional
        File extensions to read (default is ['dat', 'diapl', 'pysis', 'txt']).
    
    Returns
    -------
    data : dict
        Dictionary containing the data.
        Keys are observatory names, and values are tuples of (t_data, flux_data, flux_err_data).
            t_data : np.ndarray
                Time data.
            mag_data : np.ndarray
                Magnitude data.
            mag_err_data : np.ndarray
                Magnitude error data.
    '''
    data = {} # initialising the data dictionary

    # list of all the 'dat' files in data_directory
    files = [f for f in os.listdir(data_dir) if f.split('.')[-1] in extension] 

    print(files)
    
    files.sort() # alphabetise the data files

    for f in files:

        # creating the path to each data file
        datafile = os.path.join(data_dir, f)

        # Reading in the data: columns are (HJD, mag, mag_err, ...)
        t_data, mag, mag_err = np.loadtxt(datafile, usecols=(0, 1, 2), unpack=True)
        if t_data[0] > 2450000:
            t_data = t_data - 2450000

        # This only works because there is only one KMT field (42)
        if 'OGLE' in f:  # (HJD, mag, mag_err, quality?, sky counts?)
            observatory = 'OGLE'
            flux_data = mag2flux(mag, zp=28.)
            flux_err_data = mag2flux_err(mag, mag_err, zp=28)
        if 'KMT' in f:
            flux_data = mag2flux(mag, zp=28.65)
            flux_err_data = mag2flux_err(mag, mag_err, zp=28.65)
            
            if 'KMTA' in f:  # (HJD, mag, mag_err)
                observatory = 'KMT-A'
            elif 'KMTC' in f:
                observatory = 'KMT-C'
            elif 'KMTS' in f:
                observatory = 'KMT-S'

        data[observatory] = (t_data, flux_data, flux_err_data)
    
    return data

data = read_data(data_dir)

print(data)
['KMTC42_I.txt', 'OGLE-2017-BLG-0002.txt', 'KMTA42_I.txt', 'KMTS42_I.txt']
{'KMT-A': (array([7806.24415, 7806.25732, 7806.26931, ..., 7937.1415 , 7937.16061,
       7937.18001], shape=(1107,)), array([17684.79378282, 19989.41111236, 19230.9172891 , ...,
       15332.04199035, 15417.0045295 , 16998.08513203], shape=(1107,)), array([354.1075771 , 361.40651699, 321.30165513, ..., 461.20272052,
       363.79339135, 390.76914175], shape=(1107,))), 'KMT-C': (array([7805.87179, 7805.88926, 7806.859  , ..., 7938.52161, 7938.55645,
       7938.57627], shape=(1096,)), array([19751.49622061, 19860.94917357, 19037.06531742, ...,
       14832.00967182, 15417.0045295 , 14804.71324898], shape=(1096,)), array([326.54281547, 321.94993952, 267.91623672, ..., 316.93022935,
       302.73517188, 304.07487934], shape=(1096,))), 'KMT-S': (array([7808.59108, 7808.60346, 7808.61802, ..., 7937.56534, 7937.59386,
       7938.23454], shape=(1543,)), array([17881.3374869 , 17930.8137708 , 18586.60202689, ...,
       14955.46772744, 15332.04199035, 14242.95164973], shape=(1543,)), array([341.73839946, 255.32019627, 260.54955302, ..., 309.09966374,
       370.82619231, 371.90219641], shape=(1543,))), 'OGLE': (array([7059.88047, 7060.87017, 7060.88628, ..., 8049.49982, 8053.51668,
       8054.49895], shape=(2657,)), array([8024.16776309, 9028.17564966, 8218.64080389, ..., 9221.46665124,
       7812.67934552, 7935.96968196], shape=(2657,)), array([369.5265815 , 291.03399735, 287.64662096, ..., 195.34554715,
       431.74461594, 489.72297908], shape=(2657,)))}

If you plot these data, you may find that the KMTNet lightcurves are full of "bad data" that are not reasonable and will have an effect on the fit. This kind of data would warrant a more careful re-reduction or filtering based on image-quality metrics. In lieu of any such image-quality data or access to the raw images, we could remove the data with the greatest uncertainties to make our plots look nicer for these exercises, but this procedure is considered bad practice for statistical analysis. Faced with this same problem in your own work, you could consider "fitting" the bad-data removal so as to marginalize its effect and provide statistical grounds for the pruning selection. Refer to the Pruning Outliers section on page 11 of Hogg, Bovy & Lang (2010).
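As a middle ground between doing nothing and full marginalisation, iterative sigma-clipping of residuals about the current best-fit model is common in practice. A minimal sketch (the nsigma threshold is an arbitrary choice, and this is not the Hogg, Bovy & Lang approach):

def sigma_clip_mask(f: np.ndarray, f_model: np.ndarray, f_err: np.ndarray,
                    nsigma: float = 5.0) -> np.ndarray:
    '''Boolean mask keeping points within nsigma of the model.'''
    return np.abs((f - f_model) / f_err) < nsigma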

# Visualizing the data
plt.close(4)
plt.figure(figsize=(9,6), num=4)
colors = ('purple', 'violet', 'plum', 'thistle')

######################
# EXERCISE: SingleLensE5.txt PART: 1
#--------------------- 
# change this to True to remove data with large errors
mask_large_errors = True
######################

for i, key in enumerate(data.keys()):

    if key == 'OGLE':
        zp = 28
    else:
        zp = 28.65
    mag = flux2mag(data[key][1], zp=zp)
    mag_err = flux2mag_err(data[key][1], data[key][2])
    t = data[key][0]

    # Remove data with large errors
    if mask_large_errors:
        error_mask = mag_err < 0.1
    else:
        error_mask = [True] * len(mag)

    plt.errorbar(
                t[error_mask], 
                mag[error_mask], 
                yerr=mag_err[error_mask], 
                fmt='x', 
                color=colors[i], 
                label=key
                )

######################
# EXERCISE: SingleLensE5.txt PART: 2
#--------------------- 
# change this to your guess for t_0
t_0_guess = 7780
plt.vlines(t_0_guess, 16, 20, color='black', linestyle='--', label='t_0')
######################

plt.xlabel('HJD - 2450000')
plt.ylabel('Magnitude (zp = 28)')
plt.title('OB170002')
plt.xlim(7750, 8000)
plt.ylim(20, 16)
plt.legend()
plt.show()

Exercise 5

Fit a point-source 1L1S model to event OB170002.


Note that each data set requires individual FS and FB parameters in order to be comparable to the unifying magnification curve, which is borne of the physical lensing configuration and thus is not data-set specific.


Here, "data set" refers to lightcurve data, for a given event, reduced using the same method, from the same observatory and field, imaged in the same band.


######################
# EXERCISE: SingleLensE5.txt PART: 3
#--------------------- 
# Fitting a point-source 1L1S model to OB170002

# new objective function
#--------------------------------
def chi2_many_datasets(
                       p: List[float],  # parameters
                       data: dict,  # data dictionary of (t, f, f_err)
                       SL_magnification: Callable[[float, float, float, np.ndarray], np.ndarray]
                      ) -> float:
    '''
    Calculates the chi-squared value for a point-source 1L1S model.

    Parameters
    ----------
    p : List[float]
        Parameters of the point-source 1L1S model.
    data : dict
        Data dictionary of (t, f, f_err) for each observatory.
    SL_magnification : Callable[[float, float, float, np.ndarray], np.ndarray]
        Magnification function of the point-source 1L1S model.

    Returns
    -------
    chi2 : float
        Chi-squared value.
    '''

    chi2 = 0
    for key in data.keys():
        t, f, f_err = data[key]
        A = SL_magnification(p[0], p[1], p[2], t)
        FS, FB = calc_Fs(A, f, f_err**2)
        F = FS * A + FB
        chi2 += np.sum((f - F)**2 / f_err**2)
    return chi2


# plotting the initial guess
#--------------------------------
plt.close(5)
plt.figure(figsize=(9,6), num=5)
colors = ('purple', 'violet', 'plum', 'thistle')

# Initial guess
t_0_guess = 7793
t_E_guess = 10
u_0_guess = 0.1
p_0 = [u_0_guess, t_0_guess, t_E_guess]

# Mask large errors?
mask_large_errors = False
filtered_data = {}

# plot the initial guess and create a filtered data dictionary
# filtered_data == data, if mask_large_errors == False
# dictionary format: data[obskey] = (t, f, f_err)
for i, key in enumerate(data.keys()):

    if key == 'OGLE':
        zp = 28
    else:
        zp = 28.65

    flux = data[key][1]
    flux_err = data[key][2]
    mag = flux2mag(flux, zp=zp)
    mag_err = flux2mag_err(flux, flux_err)
    t = data[key][0]

    # Remove data with large errors
    if mask_large_errors:
        error_mask = mag_err < 0.1
    else:
        error_mask = [True] * len(mag)

    plt.errorbar(
                t[error_mask], 
                mag[error_mask], 
                yerr=mag_err[error_mask], 
                fmt='x', 
                color=colors[i], 
                label=key,
                zorder=i+1
                )
    
    filtered_data[key] = (t[error_mask], flux[error_mask], flux_err[error_mask])

# fit FS FB based on OGLE data
key = 'OGLE'
zp = 28.0
t_model = np.linspace(7750, 8000, 1000)

A_0 = SL_magnification(*p_0, filtered_data[key][0])  # initial guess at the data epochs
A_0_model = SL_magnification(*p_0, t_model)  # initial guess on a fine grid
FS, FB = calc_Fs(A_0, filtered_data[key][1], filtered_data[key][2]**2)
F_model = FS * A_0_model + FB
mag_model = flux2mag(F_model, zp=zp)

plt.plot(t_model, mag_model, 'k:', label='initial guess', zorder=10)


# fit to all the data simultaneously
#--------------------------------
# Minimize the chi squared value
result = minimize(
                  chi2_many_datasets, 
                  p_0, 
                  args=(filtered_data, SL_magnification), 
                  method='Nelder-Mead'
                  )
popt = result.x
print(result)


# plot the fit
#--------------------------------
key = 'OGLE'
zp = 28.0

A_opt = SL_magnification(*popt, filtered_data[key][0])
A_opt_model = SL_magnification(*popt, t_model)
FS, FB = calc_Fs(A_opt, filtered_data[key][1], filtered_data[key][2]**2)
F_opt = FS * A_opt_model + FB
mag_opt = flux2mag(F_opt, zp=zp)

plt.plot(t_model, mag_opt, 'k-', label='fit', lw=1, alpha=0.5, zorder=11)


plt.xlabel('HJD - 2450000')
plt.ylabel('Magnitude (zp = 28)')
plt.title('OB170002')
plt.xlim(7750, 8000)
plt.ylim(20, 16)
plt.legend()
plt.show()
######################
       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 75263.78658318985
             x: [ 2.465e-01  7.793e+03  1.415e+01]
           nit: 184
          nfev: 328
 final_simplex: (array([[ 2.465e-01,  7.793e+03,  1.415e+01],
                       [ 2.465e-01,  7.793e+03,  1.415e+01],
                       [ 2.465e-01,  7.793e+03,  1.415e+01],
                       [ 2.465e-01,  7.793e+03,  1.415e+01]]), array([ 7.526e+04,  7.526e+04,  7.526e+04,  7.526e+04]))

You can find further point-source exercises here.

Free single-lens-fitting codes#

If data fitted with a point-source point-lens (PSPL) model show signs of deviations from the model, a binary-lens, binary-source, and/or higher-order-effect model needs to be considered in an attempt to describe these deviations.

While creating a simple 1L1S function is a fairly trivial process, these more complicated models are not. It is useful to be familiar with the freely available codes one can employ to fit such models.

Open source single-lens codes:

| Name | Notes | Maintained | Link |
| --- | --- | --- | --- |
| MulensModel | User-friendly, single- and binary-lens fitting code | Yes (Poleski) | GitHub |
| BAGEL | Incorporates photometric and astrometric microlensing | Yes (Moving Universe Lab) | GitHub |
| VBMicrolensing | A more general version of the binary-lens code VBBL | Yes (Bozza) | GitHub |
| pyLIMA | Useful for space-based observations | Yes (Bachelet) | GitHub |
| RTModel | Hands-off model fitting with built-in model "interpretation" (e.g., determining single-lens vs binary-lens arrangement) | Yes (Bozza) | GitHub |
| muLAn | Designed for Roman data (space-based observations) | No (Cassan/Ranc) | GitHub |


If you would like to add your code to this list, please email the corresponding author; details are in the ReadMe.md file.

MulensModel#

As an example, let's use `MulensModel` to create 1L1S magnification curves with finite-source effects (i.e., a finite-source 1L1S model, or FSPL).

The different options available for methods of calculating the finite-source magnification are described in this document.

First off, install the module if you are not using the provided environment. Execute the following code in your terminal:

(TheGuide) ~/$ pip install MulensModel

If that worked, you should be able to run the following code without error. You may need to restart the notebook kernel.

These microlensing codes are not on conda-forge (which requires binary packages of compiled code), so they must be installed from source (e.g., using pip) as opposed to `conda install MulensModel`.

At the time of this notebook's last edit, there was a bug in the pip version of MulensModel. There is a simple fix: run the following cell and the package will be corrected, if it needs it.

# This cell restores the working directory in case of failure. 
# Make sure you run this cell before running the next cell.
cwd = os.getcwd()
print('current working directory:',cwd)
try:
    os.chdir(_cwd)  # if _cwd survives from a previous run, return to it
except NameError:
    _cwd = cwd  # first run: remember the starting directory
print('working directory reset to:', _cwd)
current working directory: /home/runner/work/TheMicrolensersGuideToTheGalaxy/TheMicrolensersGuideToTheGalaxy/docs/Solved
working directory reset to: /home/runner/work/TheMicrolensersGuideToTheGalaxy/TheMicrolensersGuideToTheGalaxy/docs/Solved
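Exercise 6

Use MulensModel to compute finite-source (FSPL) magnification curves for several source trajectories, and plot them alongside the trajectories in the lens plane.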
# Fixing MulensModel
import MulensModel

# NOTE: the set-up for this cell (figure, model parameters, and the trajectory
# loop that encloses the exercise block) was missing from an earlier draft.
# The lines below are a plausible reconstruction with illustrative values.
t0, tE, rho = 0.0, 1.0, 0.05  # work in units of tau (t0 = 0, tE = 1)
xlims, ylims = (-2.0, 2.0), (-2.0, 2.0)

plt.close(11)
fig, ax = plt.subplots(1, 2, figsize=(12, 6), num=11)

trajectories = [{'u0': 0.0, 'color': 'purple'},
                {'u0': 0.3, 'color': 'violet'},
                {'u0': 0.7, 'color': 'plum'}]

tt = np.linspace(xlims[0], xlims[1], 1000)  # tau = (t - t0)/tE
t = t0 + tt*tE  # corresponding epochs

for trajectory in trajectories:
    # FSPL model for this source trajectory
    fspl = MulensModel.Model(
        {'t_0': t0, 'u_0': trajectory['u0'], 't_E': tE, 'rho': rho})
    fspl.set_magnification_methods(
        [t0 - 5.0*tE, 'finite_source_uniform_Gould94', t0 + 5.0*tE])

    # Source trajectory in the lens plane
    ax[0].plot(tt, np.full_like(tt, trajectory['u0']), ls='-',
               c=trajectory['color'])

    ######################
    # EXERCISE: SingleLensE6.txt
    #---------------------
    # Magnification curve array
    A = fspl.get_magnification(t)

    # Plot lightcurves (technically, magnification curves)
    ax[1].plot(tt, np.log10(A), marker='', ls='-', c=trajectory['color'],
               label=r'$u_0=%1.2f$' %trajectory['u0'])
    ######################

# Aesthetics
ax[0].grid(True)
ax[0].set_aspect('equal')  # Equal in x and y
#ax[1].set_aspect(6.7)  # manually forcing the magnification curve to be the same 
                        # size as the caustic diagram

ax[0].set_ylabel(r'$u_{2}$')  # axis labels
ax[0].set_xlabel(r'$u_{1}$')
ax[1].set_ylabel(r'$\log_{10}A$')
ax[1].set_xlabel(r'$\tau$')
ax[1].yaxis.set_label_position("right")
ax[1].yaxis.tick_right()

ax[1].legend()  # u0=...

phi = np.linspace(-np.pi,np.pi,200)
r = 1  # theta_E
# r^2 = x^2 + y^2
x = r * np.cos(phi)
y = r * np.sin(phi)
ax[0].plot(x, y, 'k--', alpha=0.3)  # thetaE circle
ax[0].plot(0, 0, marker='o', c='red', ms=np.log10(5000.0))  # lens object

ax[0].set_xlim(xlims[0],xlims[1])  # set plot bounds
ax[0].set_ylim(ylims[0],ylims[1])
ax[1].set_xlim(xlims[0],xlims[1])

# Save
#plt.savefig('/Products/single_trajectories_FS.png', bbox_inches='tight')  # 'tight' removes outer padding

The MulensModel package has built-in data-storage and event-grouping objects and a $\chi^2$-calculating function. We could use this functionality to fit a finite-source model to OB170002, as seen in the MulensModel fitting example.

We can add our data to a MulensData object as follows:

OB170002_KMTC_data = MulensModel.MulensData(data_list = data['KMT-C'], 
                                            plot_properties={'color':'purple', 'label':'KMT-C', 'marker':'x', 'markersize':2}, 
                                            phot_fmt='flux', 
                                            bandpass='I')
OB170002_KMTA_data = MulensModel.MulensData(data_list = data['KMT-A'],
                                            plot_properties={'color':'violet', 'label':'KMT-A', 'marker':'x', 'markersize':2}, 
                                            phot_fmt='flux', 
                                            bandpass='I')
OB170002_KMTS_data = MulensModel.MulensData(data_list = data['KMT-S'],
                                            plot_properties={'color':'plum', 'label':'KMT-S', 'marker':'x', 'markersize':2}, 
                                            phot_fmt='flux', 
                                            bandpass='I')
OB170002_OGLE_data = MulensModel.MulensData(data_list = data['OGLE'],
                                            plot_properties={'color':'thistle', 'label':'OGLE', 'marker':'x', 'markersize':2}, 
                                            phot_fmt='flux', 
                                            bandpass='I')

OB170002_data = [OB170002_KMTC_data, OB170002_KMTA_data, OB170002_KMTS_data, OB170002_OGLE_data]

plt.close(12)
plt.figure(12)
OB170002_OGLE_data.plot()
OB170002_KMTA_data.plot()
OB170002_KMTS_data.plot()
OB170002_KMTC_data.plot()
plt.legend()
plt.ylim(5000, 40000)
plt.title('OB170002')
plt.show()

We can define a finite source model as follows:

u0, t0, tE, rho = 0.2465, 7793.0, 14.15, 0.01 # use the fit values from exercise 5!!
Theta0 = np.array([t0, u0, tE, rho])  # initial guess
labels = np.array(['t_0', 'u_0', 't_E', 'rho']) # parameter labels

OB170002_fspl = MulensModel.Model({'t_0': t0, 'u_0': u0, 't_E': tE, 'rho': rho}) 

# setting the magnification method
OB170002_fspl.set_magnification_methods([2450000., 'finite_source_uniform_Gould94', 2470000.]) # rho <= 0.1

plt.figure(12)
t = np.linspace(7000, 8100, 2000)  # time range
A0 = OB170002_fspl.get_magnification(t)  # initial guess magnification curve
plt.plot(t, A0*8500, label='Initial Guess', color='black', linestyle=':')  # 8500: rough flux scaling so the guess overlays the data
plt.show()

We combine these in an Event object.

OB170002 = MulensModel.Event(datasets=OB170002_data, model=OB170002_fspl)

def chi2_fun(theta, parameters_to_fit, event):
    """
    Calculate chi2 for given values of parameters.
    
    Parameters
    ----------
    theta : np.ndarray
        Vector of parameter values, e.g., `np.array([5380., 0.5, 20.])`.
    parameters_to_fit : list of str
        List of names of parameters corresponding to theta, e.g., `['t_0', 'u_0', 't_E']`.
    event : MulensModel.Event
        Event which has datasets for which chi2 will be calculated.
    
    Returns
    -------
    chi2 : float
        Chi2 value for given model parameters.
    
    Notes
    -----
    Function from MulensModel documentation: 
    https://github.com/rpoleski/MulensModel/blob/master/examples/example_02_fitting.py
    """
    # First we have to change the values of parameters in
    # event.model.parameters to values given by theta.
    for (parameter, value) in zip(parameters_to_fit, theta):
        setattr(event.model.parameters, parameter, value)

    # After that, calculating chi2 is trivial:
    return event.get_chi2()

print('Initial Parameter Guess:\n{0}'.format(OB170002.model.parameters))

plt.close(13)
plt.figure(13)

chi20 = chi2_fun(Theta0, labels, OB170002)  # initial guess chi2
OB170002.plot_model(color='k', label=r'Initial Guess ($\chi^2 = %1.3f$)' %chi20, linestyle=':', t_range=[7000, 8100])
OB170002.plot_data()  # MulensModel automatically fits for the source and blend flux for the  
# given model.

plt.legend(loc='best')
plt.title('OB170002')
plt.ylim(12.5, 10)
plt.xlim(7750,8000)
plt.show()

Exercise 7

1. Can you determine the angular size of the source in event OB170002, in units of θE?


2. How much did the addition of a finite source parameter in the model increase the relative probability of your model generating the observed data?


3. Why does it not make sense for your relative probability to have decreased?

If your probability does decrease with an increase in the complexity of your model, this is an indication that your minimisation method is failing somehow. We refer you to the Modelling notebook for a better understanding of the topic.

######################
# EXERCISE: SingleLensE7.txt PART: 1a
#--------------------- 
# E7.1 Write your model fitting code here

# Nelder-Mead
#--------------------------------
# Minimize chi2_fun using the initial guess
result = minimize(
    chi2_fun,
    Theta0,
    args=(labels, OB170002),
    method='Nelder-Mead'  # or 'Powell', 'L-BFGS-B', etc.
)

# Extract best-fit parameters
best_fit_params = result.x
best_fit_rho = best_fit_params[3]

# uncertainty on rho
# Scan rho around the best-fit value
rho_values = np.linspace(1e-4, 1.0, 1000)  # start just above zero; rho = 0 is not a valid finite-source size
chi2_values = []

for rho in rho_values:
    params = best_fit_params.copy()
    params[3] = rho
    chi2 = chi2_fun(params, labels, OB170002)
    chi2_values.append(chi2)

chi2_values = np.array(chi2_values)
chi2_min = np.min(chi2_values)

# Find where chi2 = chi2_min + 1
within_1sigma = np.where(chi2_values < chi2_min + 1)[0]
rho_lower = rho_values[within_1sigma[0]]
rho_upper = rho_values[within_1sigma[-1]]

print(f"Best fit rho (Nelder-Mead) = {best_fit_rho:.5f} (+{rho_upper - best_fit_rho:.5f}, -{best_fit_rho - rho_lower:.5f})")


# MCMC
#--------------------------------
import emcee

def log_prior(theta):
    t0, u0, tE, rho = theta
    if (u0 > 0) and (tE > 0) and (0 < rho < 1.0):
        return 0.0
    return -np.inf

# Log-likelihood: -0.5 * chi2
def log_likelihood(theta, labels, event):
    return -0.5 * chi2_fun(theta, labels, event)

# Log-probability
def log_probability(theta, labels, event):
    lp = log_prior(theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(theta, labels, event)

ndim = 4  # number of parameters
nwalkers = 32  # number of MCMC walkers
nsteps = 2000  # number of steps

# Initial positions: small Gaussian ball around the initial guess
pos = Theta0 + 1e-4 * np.random.randn(nwalkers, ndim)
# Ensure all initial rho are positive and < 0.1
for p in pos:
    p[3] = np.abs(p[3]) % 0.1

sampler = emcee.EnsembleSampler(
    nwalkers, ndim, log_probability, args=(labels, OB170002)
)

print("Running MCMC...")
sampler.run_mcmc(pos, nsteps, progress=True)

# Discard burn-in and flatten the chain
burnin = nsteps // 2
samples = sampler.get_chain(discard=burnin, flat=True)

# Get median and 1-sigma for rho
rho_samples = samples[:, 3]
rho_median = np.median(rho_samples)
rho_std = np.std(rho_samples)

print(f"best fit rho (MCMC) = {rho_median:.5f} ± {rho_std:.5f}")

plt.close(711)
fig, axes = plt.subplots(ndim, 1, figsize=(8, 2.5 * ndim), sharex=True, num=711)

for i in range(ndim):
    ax = axes[i]
    ax.plot(sampler.get_chain()[:, :, i], alpha=0.5)
    ax.set_ylabel(labels[i])
    ax.set_title(labels[i])

axes[-1].set_xlabel('Step')
plt.tight_layout()
plt.show()

plt.close(712)
plt.figure(712)
plt.hist(rho_samples, bins=50, alpha=0.7)
plt.xlabel('rho')
plt.ylabel('Frequency')
plt.title('Posterior distribution of rho')
plt.show()
######################

E7.1 Can you determine the angular size of the source in event OB170002, in units of θE?#

No. $\rho$ is unconstrained: the Nelder-Mead 1$\sigma$ interval recovers the bounds of the scan, and the MCMC posterior is flat. This is expected, since finite-source effects only become significant when the source passes within a few source radii of the lens; here $u_0 \approx 0.25 \gg \rho$, so the lightcurve carries essentially no information about $\rho$.

######################
# EXERCISE: SingleLensE7.txt PART: 2
#--------------------- 
plt.close(13)
plt.figure(13)

chi20 = chi2_fun(Theta0, labels, OB170002)  # initial guess chi2
OB170002.plot_model(color='k', label=r'Initial Guess ($\chi^2 = %1.3f$)' %chi20, linestyle=':', t_range=[7000, 8100])
OB170002.plot_data()  # MulensModel automatically fits for the source and blend flux for the  
# given model.
chi2f = chi2_fun(best_fit_params, labels, OB170002)  # the nelder-mead fit
OB170002.plot_model(color='k', label=r'Best Fit ($\chi^2 = %1.3f$)' %chi2f, linestyle='-', t_range=[7000, 8100])

delta_chi2 = chi20 - chi2f
print(f"Delta chi2 = {delta_chi2:.3f}")
relative_prob = np.exp(delta_chi2 / 2)
print(f"Relative probability (best fit vs. initial guess): {relative_prob:.3e}")

plt.legend(loc='best')
plt.title('OB170002')
plt.ylim(12.5, 10)
plt.xlim(7750,8000)
plt.show()
######################

E7.3 Why does it not make sense for your relative probability to have decreased (i.e. be less than 1)?#

It does not make sense for the relative probability to decrease (i.e., for the best-fit finite-source model to be less probable than the initial guess) because the finite-source point-lens (FSPL) model is a generalization of the point-source point-lens (PSPL) model. Specifically, the PSPL model is simply the special case of the FSPL model where the source size parameter, ρ, approaches zero.

When fitting the data with the FSPL model, the optimizer is free to choose any value of ρ, including values very close to zero. Therefore, the FSPL fit should always be able to recover the PSPL solution by setting ρ ≈ 0. This means that the best-fit FSPL model should have a chi-squared value that is at least as good as (and typically equal to or better than) that of the PSPL fit.

If your relative probability decreases (i.e., the best-fit FSPL model has a higher chi-squared than the PSPL model), it usually means that the fit did not converge properly, or that the optimizer was stuck at a suboptimal value of ρ (for example, if you started with a random, nonzero ρ and the fit did not move it close to zero). In a properly performed fit, the FSPL model should never be worse than the PSPL model, because it includes the PSPL as a limiting case. (A quick numerical check of this limit is sketched after the summary below.)

Summary:

  • The FSPL model can always reproduce the PSPL result by setting ρ very close to zero.

  • The best-fit FSPL model should never be less probable than the PSPL model.

  • A decrease in relative probability indicates a problem with the fit, not with the models themselves.
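We can check this limiting behaviour numerically (a quick sketch with illustrative parameters, reusing SL_magnification and MulensModel from above):

# As rho -> 0, the FSPL magnification approaches the PSPL value (~3.44 here)
small_rho = MulensModel.Model({'t_0': 0., 'u_0': 0.3, 't_E': 1., 'rho': 1e-3})
small_rho.set_magnification_methods([-5., 'finite_source_uniform_Gould94', 5.])
print(small_rho.get_magnification(np.array([0.])))   # FSPL, rho ~ 0
print(SL_magnification(0.3, 0., 1., np.array([0.])))  # PSPL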

Smashed it! Good job. I have nothing more to teach you, Grasshopper.

Next steps#

If you want to further develop your 1L1S prowess, the RGES-PIT has a data-challenge-style notebook that walks you through bulk fitting seasons of single-lens events. It concentrates on robust unattended fitting and computational speed, but it also covers skills like web scraping, dataframes, and model fitting for non-Earth-based observatories, using MulensModel. You can find the notebook here:

Open In Colab

Unlike this notebook series, the RGES-PIT notebooks are designed to run in Google Colab, and it is recommended that you use Google Colab for the best user experience.

Microlensing modelling is riddled with degeneracies. If we want to gain more understanding about the physical make-up of our event bodies (i.e. size, mass, distance) then we need more information through either the modelling of higher-order effects, follow-up observations, or both. Failing that, we can use Bayesian statistics to determine the most likely physical parameters for an event. If you are interested in determining physical parameters from microlensing events proceed to the following notebooks:

If you are interested in binary lens events (e.g., for exoplanet detection), you should consider both of the following notebooks:

However, it would be best if you eventually completed these as well.

Microlensing events, particularly multiple-lens events, have notoriously stochastic likelihood landscapes. This can challenge more basic minimisation methods, such as the Nelder-Mead method used in this notebook. If you intend to fit microlensing events with more than one lens, or are interested in Bayesian posteriors for your fits, we recommend you also check out the Modelling notebook.