Definition talk:Probability Density Function

From ProofWiki

Is this defined only when that complicated limit actually exists? Is it the case that it does always exist? --prime mover (talk) 13:55, 20 December 2012 (UTC)

Corrected to point out that it doesn't always exist. The definition is neater when the pdf is considered as the derivative of the cdf, but I was going to add that as a theorem, with differentiability of the cdf as a sufficient condition for the pdf to exist. If you have a more elegant approach, by all means please suggest it. The motivation for this definition is to get around the problem that $\Pr (X = c) = 0$. Should I mention that on the page so that the limit isn't coming out of nowhere? --GFauxPas (talk) 14:37, 20 December 2012 (UTC)
Having thought about it, if $X$ is continuous, then it probably does exist throughout. No worries, just thought I'd ask. If you're quoting what Bean says, then trust him not me ... --prime mover (talk) 17:37, 20 December 2012 (UTC)
It wasn't just what you said; I realize now I was confusing piecewise-continuous and continuous. I'll fix it later --GFauxPas (talk) 17:48, 20 December 2012 (UTC)
Okay, I have enough clarity to fix the page. Now, what's the best way to present the following definition? I'm not sure how to present it clearly and rigorously:

Let $X$ be piecewise continuous.

$\forall x \in \R: f_X \left({x}\right) = \begin{cases}
\ds \lim_{\epsilon \to 0^+} \frac{\Pr \left({x - \frac \epsilon 2 \le X \le x + \frac \epsilon 2}\right)} \epsilon & : \text{all $x \in \Omega_X$ except for countably many $x$ at which $f_X$ has a removable discontinuity} \\
\text{doesn't matter, as long as it's a real number between 0 and 1} & : \text{the countably many $x \in \Omega_X$ that I mentioned in the above line} \\
0 & : x \notin \Omega_X
\end{cases}$

--GFauxPas (talk) 14:18, 21 December 2012 (UTC)

So you are defining $f_X$, and in that definition, you refer to it having a discontinuity? That's not a very good idea. Not sure how to circumvent that, though. --Lord_Farin (talk) 15:39, 21 December 2012 (UTC)
Why is it a bad idea? It's just saying that $X$ need not be everywhere continuous to be a probability measure --GFauxPas (talk) 23:02, 22 December 2012 (UTC)

This seems to basically define (at least A.E. - a PDF is not unique) the PDF as the derivative of the CDF, or at least some one-sided derivative. (rewrite numerator as $\map {F_X} {x + \epsilon/2} - \map {F_X} {x - \epsilon/2}$) I'd prefer this to be a theorem, but I suppose it works as a definition too. There's machinery missing for this though: first we are talking about absolutely continuous random variables, not general continuous random variables, which implies that the CDF is absolutely continuous. (I might add this as a second definition, but definitely at least as a theorem) A result (not yet on this site) in real analysis then says that the CDF is almost everywhere differentiable, meaning that you can define $f_X$ on the complement of a null set (which may be countable or finite [or indeed empty] but could well be bigger than that) by the derivative of the CDF. You can then fill in the missing points as discussed above. The actual value of $f_X$ on this null set is arbitrary, (integration doesn't care about functions that differ only on a null set) so it'd be conventional to send these points to $0$. Caliburn (talk) 16:17, 29 December 2021 (UTC)
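
To spell out the rewrite of the numerator suggested above, here is a sketch. It assumes that $F_X$ denotes the CDF of $X$ and that $F_X$ is continuous, so that single points carry no probability:

$\ds \map \Pr {x - \frac \epsilon 2 \le X \le x + \frac \epsilon 2} = \map {F_X} {x + \frac \epsilon 2} - \map {F_X} {x - \frac \epsilon 2} + \map \Pr {X = x - \frac \epsilon 2} = \map {F_X} {x + \frac \epsilon 2} - \map {F_X} {x - \frac \epsilon 2}$

Writing $h = \epsilon / 2$ (a symbol introduced here only for illustration), the quotient in the definition is the symmetric difference quotient of $F_X$:

$\ds \frac {\map {F_X} {x + h} - \map {F_X} {x - h}} {2 h} = \frac 1 2 \cdot \frac {\map {F_X} {x + h} - \map {F_X} x} h + \frac 1 2 \cdot \frac {\map {F_X} x - \map {F_X} {x - h}} h$

At any $x$ where $F_X$ is differentiable, both quotients on the right tend to $\map {F_X'} x$ as $h \to 0^+$, so the limit equals $\map {F_X'} x$. The converse need not hold: the symmetric limit can exist at a point where $F_X$ is not differentiable.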

I think I had a lapse of concentration, I meant to finish this off with: the proper way to define a PDF seems to be as a particular Radon-Nikodym derivative of the probability distribution of $X$ with respect to the Lebesgue measure of $\R$. I'm still yet to properly set up Radon-Nikodym derivatives, but the proof of their existence and essential uniqueness is already up at Radon-Nikodym Theorem. I'll decide what to do with what's already here closer to the time. Caliburn (talk) 16:31, 29 December 2021 (UTC)

Direction

I am just about to implement the Radon-Nikodym stuff. I propose that we replace this with defining $f_X$ as the Radon-Nikodym derivative $\dfrac {\d P_X} {\d \lambda}$, where $P_X$ is the probability distribution associated with $X$. Terminologically this is intimidating, but all we really ask is that $\map \Pr {X \in A} = \int_A f_X \rd \lambda$ for each Borel set $A$. Then we have formalised the PDF as an equivalence class, and it is then a theorem that the PDF is almost everywhere the derivative of the CDF. (Speaking concretely: since the CDF is absolutely continuous, it is almost everywhere differentiable as mentioned above; filling in these missing points [which form a set of measure zero] with non-negative values then gives a PDF. If the CDF is everywhere differentiable then the derivative works with no extra work.) I'm not sure whether the first definition should be kept: without knowing that the CDF is differentiable almost everywhere we don't know whether it's well-defined, and the definition does not try to avoid points where differentiability fails. Also, $X$ must be absolutely continuous with respect to the Lebesgue measure, otherwise a density does not exist. There is also a missing link between the Lebesgue and Riemann integrals; this is on my to-do list. Caliburn (talk) 11:33, 10 June 2022 (UTC)
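
As a sketch of how the proposed definition connects back to the CDF: assume $F_X$ denotes the CDF of $X$. The argument uses the Lebesgue differentiation theorem, which is brought in here as an assumption and is not cited in this discussion. Taking $A = (-\infty, x]$ in the defining property gives:

$\ds \map {F_X} x = \map \Pr {X \le x} = \int_{(-\infty, x]} f_X \rd \lambda$

Since $f_X$ is integrable, the Lebesgue differentiation theorem then gives:

$\ds \map {F_X'} x = \map {f_X} x$ for $\lambda$-almost every $x \in \R$

Also, if $g$ is a Borel measurable function with $g = f_X$ $\lambda$-almost everywhere, then:

$\ds \int_A g \rd \lambda = \int_A f_X \rd \lambda = \map \Pr {X \in A}$ for each Borel set $A$

so $g$ serves equally well as a density. This is the sense in which the PDF is determined only up to an equivalence class.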

Sounds plausible -- I see the groundwork has been done, it makes sense to take it forward in that direction. Mind, I'm not at the sharp end of this by a long way, and so can't be the final arbiter. As always, if we have multiple directions to come to a concept, we document them all and then (at such time we can come up with equivalence proofs) demonstrate equivalence. Glaring discrepancies in definitions such that different definitions define different concepts are done with "formulation 1" and "formulation 2" and a linking page describing the discrepancies, which can then be addressed by whoever has the knowledge. --prime mover (talk) 11:50, 10 June 2022 (UTC)
To keep the whole accessible, I would really like a "naive", real approach and definition to exist side by side with the general one. That is by defining PDF as derivative of CDF provided this derivative exists. Subsequently there would be value in explaining the relationship between general and "naive" and highlighting some of the pitfalls that are avoided/dealt with in the general case. — Lord_Farin (talk) 12:27, 10 June 2022 (UTC)
Ok, I will make it "Definition" and "Naive Definition". I guess we can go forward with similar formats to Definition:Expectation, since I agree that naive treatments of probability fall unhelpfully far from the proper measure theoretic formalism. Caliburn (talk) 14:27, 10 June 2022 (UTC)
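
One pitfall of the naive definition can be illustrated with a standard example, chosen here for illustration and not taken from the page itself. Let $X$ be uniformly distributed on $[0, 1]$, with CDF:

$\map {F_X} x = \begin{cases} 0 & : x < 0 \\ x & : 0 \le x \le 1 \\ 1 & : x > 1 \end{cases}$

Then $\map {F_X'} x = 1$ for $0 < x < 1$ and $\map {F_X'} x = 0$ for $x < 0$ or $x > 1$, but $F_X$ is not differentiable at $x = 0$ or $x = 1$, so the naive definition leaves $f_X$ undefined there. The limit from the earlier discussion does exist at these points, but it takes a third value. At $x = 0$, for $0 < \epsilon \le 2$:

$\ds \frac {\map \Pr {-\frac \epsilon 2 \le X \le \frac \epsilon 2} } \epsilon = \frac {\epsilon / 2} \epsilon = \frac 1 2$

and the same value arises at $x = 1$. Since $\set {0, 1}$ is a null set, any non-negative values assigned at these two points give a valid density in the general sense.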