| > |
file: Mars.mws
===========
7.3.2004
------------------------------------------------------------------------
In this worksheet I derive the Importance weights
as a function of the location size,
such that the label density on the screen remains constant
at all distances.
Typical applications are the locations on Mars and Venus.
-------------------------------------------------------------------------
| > | with(plots): |
| > | with(stats): with(stats[statplots]): |
Warning, these names have been redefined: anova, describe, fit, importdata, random, statevalf, statplots, transform
Warning, these names have been redefined: boxplot, histogram, scatterplot, xscale, xshift, xyexchange, xzexchange, yscale, yshift, yzexchange, zscale, zshift
The file 'size.txt' contains a 1-dim array of the (nonvanishing) sizes [km] of the 1327 Mars locations. Read it into Maple:
| > | data:=readdata("size.txt",float): |
Calculate log base 10 for each element:
| > | size:=evalf(map(log10,data)): |
Histogram plot of the log10(size) distribution:
| > | ph:=histogram(size,area=count,axes=boxed,labels=["log10(size)","number of labels"],labeldirections=[horizontal,vertical]): |
| > | display(ph); |
Extract the numerical values from histogram plot structure 'ph':
| > | dbx:=[];dby:=[];dbxy:=[]; |
| > | for i from 1 to 12 do |
| > | dbx:=[op(dbx),op([1,i],ph)[2][1]]; |
| > | dby:=[op(dby),op([1,i],ph)[2][2]]; |
| > | dbxy:=[op(dbxy),op([1,i,2],ph)]; |
| > | end do: |
| > | dbxy; |
| > | dbx; |
| > | dby; |
| > | 3.377787718-3.023181676; |
Compute the cumulative number of locations in bins j..12 (GG(1) is the total over all 12 bins):
| > | GG:=j->add(dby[i],i=j..12); |
| > | for j from 1 to 12 do |
| > | GG(j); |
| > | end do; |
| > | GG(1); |
Aha, they sum up to the total count of locations in the database.
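The binning and the tail sums GG(j) can also be sketched outside Maple. The following Python sketch is only an illustration: the sizes are made-up placeholders (not the contents of size.txt), and the names histogram_counts and tail_counts are hypothetical helpers, not part of the worksheet.

```python
import math

def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins between min and max.

    Assumes at least two distinct values, so the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land in bin n_bins, so clamp it into the last bin.
        k = min(int((v - lo) / width), n_bins - 1)
        counts[k] += 1
    return counts, width

def tail_counts(counts):
    """Return the tail sums: entry j is the sum of counts[j:] (GG(j+1) in the worksheet)."""
    tails = []
    running = 0
    for c in reversed(counts):
        running += c
        tails.append(running)
    tails.reverse()
    return tails

# Placeholder sizes in km, for illustration only.
sizes_km = [0.5, 3.0, 12.0, 40.0, 55.0, 60.0, 150.0, 900.0]
log_sizes = [math.log10(s) for s in sizes_km]
counts, width = histogram_counts(log_sizes, 4)
tails = tail_counts(counts)
print(counts, tails)
```

The first tail sum equals the number of input values, which is the same consistency check as GG(1) above.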
Test a normal distribution of log10(size) around log10(s0),
with the best s0 found by trial & error:
| > | assume(v>0); |
| > | dnLabelsdx:=sqrt(v/Pi)*nLabels_tot*exp(-v*(x-log10(s0))^2); |
| > | int(dnLabelsdx,x=-infinity..infinity); |
OK, the distribution correctly integrates to the total number
of Mars locations = 1327 with non-vanishing size
By trial & error, S0 in km:
| > | S0:=55; |
Instead of x=log10(size), take (x-log10(s0))^2 as the independent variable
and ln(dnLabelsdx) as the dependent variable:
| > | XL:=evalf(subs(s0=S0,map(x->(x-log10(s0))^2,dbx))); |
Take ln of the y-values (counts) to make the fit function linear in the parameters (this requires all bins to be non-empty):
| > | YL:=map(x->ln(x),dby); |
| > | Y:=y=collect(combine(simplify(subs((x-ln(s0)/ln(10))^2=x,ln(dnLabelsdx)),symbolic),symbolic),[x],factor); |
Do a least-squares fit to the log of the normal distribution:
| > | w:=fit[leastsquare[ [x,y],y=a-b*x] ]([XL,YL]); |
| > | cc:=solve({coeff(rhs(w-Y),x,0),coeff(rhs(w-Y),x,1)},{v,nLabels_tot}); |
| > | pt:=plot(subs(x=(x-log10(S0))^2,cc[1],cc[2],exp(rhs(Y))),x=-1..4.5,color=red,thickness=2): |
| > | display({ph,pt},labels=["log10( size )","Number of Labels"],labeldirections=[horizontal,vertical], title="Normal Distribution of log10(size) around log10(s0=55)", titlefont=[HELVETICA,20],OPTS); |
OK, not perfect, but approximately compatible with a normal distribution...
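The linearised fit can be sketched in Python as follows. Taking logs turns the normal distribution into ln y = a - v*X with X = (x - log10(s0))^2, so v is minus the fitted slope and the amplitude follows from a = ln(sqrt(v/Pi)*nLabels_tot). The helper names are hypothetical, the test data are synthetic (generated from an exact Gaussian, not from size.txt), and skipping empty bins is my addition, not something the worksheet does.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_log_gaussian(bin_centres, counts, s0):
    """Fit ln(count) = a - v*(x - log10(s0))^2 and return (v, nLabels_tot)."""
    x0 = math.log10(s0)
    # Skip empty bins, since ln(0) is undefined.
    pts = [((x - x0) ** 2, math.log(c))
           for x, c in zip(bin_centres, counts) if c > 0]
    a, slope = linear_fit([p[0] for p in pts], [p[1] for p in pts])
    v = -slope
    # Invert a = ln(sqrt(v/pi) * nLabels_tot).
    n_tot = math.exp(a) * math.sqrt(math.pi / v)
    return v, n_tot

# Synthetic check: counts generated from an exact Gaussian are recovered.
true_v, true_n, s0 = 0.68, 470.0, 55.0
x0 = math.log10(s0)
centres = [x0 + 0.35 * k for k in range(-5, 6)]
counts = [math.sqrt(true_v / math.pi) * true_n * math.exp(-true_v * (x - x0) ** 2)
          for x in centres]
v_fit, n_fit = fit_log_gaussian(centres, counts, s0)
print(v_fit, n_fit)
```

Because the fit is done on bin counts rather than on a density, the fitted amplitude is in counts per bin; the worksheet accounts for this later by dividing by the bin width.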
-------------------------------------------------------------------------------------
Next, we want to derive the Importance weights I
such that the label density on the monitor always remains constant!
-------------------------------------------------------------------------------------
Strategy:
=======
i) Let nLabels = (number of visible labels) at distance d of our object (Mars, Venus,...),
which covers an area A(d) on screen in [pix^2]
(A(d) ~ 1/d^2, consistent with the formula nLabels = (c/d)^2 used further down).
=============================================
Require that the visible label density is about constant
at all distances d [FoV's] of our object, i.e.
nLabels(d) / A(d) = constant
=============================================
ii) For the given monitor resolution and a range of 'importance weights I',
determine empirically the distances d = d_vis(I) of our object at which
the associated labels just become visible.
The relation is linear, as expected (see below):
d_vis = 14.8 + 86.9*I [km]
Thus the requirement of a constant label density turns into a formula
for the importance weights I:
nLabels = (c/(14.8 + 86.9*I))^2
iii) On Earth I calculated nLabels = nLabels(population) from
the known data on city populations. For Mars, Venus,...
we may as well take the number-distribution of the
location sizes.
Above we obtained approximately a normal distribution
nLabels = Normal(log10(size))
around s0 = 55 km (the value S0 set above).
iv) We may feed this in and determine the only unknown constant
by requiring a convenient number of visible labels at a certain
distance of the object. E.g. for Earth,
nLabels = 20 (the value used further down)
at a distance of 40000 km.
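Written out, steps i) to iv) amount to the short derivation below. It is a sketch under the assumption that the on-screen area scales as 1/d^2; the symbols a, b (intercept and slope of the linear visibility relation), n_ref and d_ref (the reference label count and distance) are shorthand introduced here, not names from the worksheet.

```latex
\begin{align*}
\frac{n_{\mathrm{Labels}}(d)}{A(d)} &= \text{const}, \quad A(d) \propto \frac{1}{d^{2}}
  \;\Longrightarrow\; n_{\mathrm{Labels}}(d) = \left(\frac{c}{d}\right)^{2} \\
d_{\mathrm{vis}}(I) &= a + b\,I, \qquad a \approx 14.8\ \mathrm{km},\; b \approx 86.9\ \mathrm{km} \\
n_{\mathrm{Labels}}(\log_{10} s) &= \left(\frac{c}{a + b\,I}\right)^{2}
  \;\Longrightarrow\; I(s) = \frac{1}{b}\left(\frac{c}{\sqrt{n_{\mathrm{Labels}}(\log_{10} s)}} - a\right) \\
c &= d_{\mathrm{ref}}\,\sqrt{n_{\mathrm{ref}}}
\end{align*}
```

Here n_Labels(log10 s) is the number of locations at least as large as s, so a larger location gets a larger weight and becomes visible from farther away.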
-------------------------------------------------------------------------
Our problem of expressing the weights as a function of the known location sizes,
so as to keep the label density on the screen constant, is solved!
Let's get quantitative:
| > | distance:=[187,325,520,999,2085,6107,9835,15708,24880,33900]; |
| > | importance:=[2.2,3.84,6.11,11.49,24.08,70.13,112.72,178.9,285.68,391]; |
| > | distimp:=[[2.2,187],[3.84,325],[6.11,520],[11.49,999],[24.08,2085],[70.13,6107],[112.72,9835],[178.9,15708],[285.68,24880],[391,33900]]; |
| > | q0:=pointplot(distimp,symbol=BOX,color=blue,symbolsize=20): |
Again a least-squares fit, this time of the linear relation between min. distance and Importance weight:
| > | fit[leastsquare[[x,y],y=a+b*x]]([importance,distance]); |
| > | q1:=plot(14.82791092+86.91039073*imp,imp=1..1000,color=red,thickness=2): |
| > | display({q0,q1},axes=boxed,labels=["Importance weight","min. distance [ km ], where visibility starts"], labeldirections=[horizontal,vertical]); |
Aha, an excellent fit!
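As a cross-check, the same straight-line fit can be reproduced with a few lines of plain Python, using the distance and importance lists from the worksheet. The function name linear_fit is a hypothetical helper; only ordinary least squares is assumed.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Data copied from the worksheet: importance weights and the
# minimum distances [km] at which the labels become visible.
importance = [2.2, 3.84, 6.11, 11.49, 24.08, 70.13, 112.72, 178.9, 285.68, 391]
distance = [187, 325, 520, 999, 2085, 6107, 9835, 15708, 24880, 33900]

intercept, slope = linear_fit(importance, distance)
print(intercept, slope)  # close to the worksheet's 14.83 and 86.91
```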
--------------------------
Next, since we want the total number of visible labels for a given log10(size) = xt,
we must integrate from xt to 'infinity' (all labels corresponding to a larger size than xt are also visible!):
| > | Int(subs(x=(x-log10(S0))^2,cc[1],cc[2],exp(rhs(Y))),x=xt..infinity)=int(subs(x=(x-log10(S0))^2,cc[1],cc[2],exp(rhs(Y))),x=xt..infinity); |
Define a function from the result and divide by the bin width (0.354606042, the difference computed above):
| > | nLabels:=xt->evalf(1/.354606042*(-233.3297812*erf(.8226361874*xt-1.431685327)+233.3297812)); |
Let's see what the total number of labels becomes? Close to 1327?
| > | nLabels(-infinity); |
YES, indeed, it's not at all bad!
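The function nLabels translates directly into Python. The sketch below copies the numerical constants from the worksheet and uses erfc(z) = 1 - erf(z), which is the same expression but avoids the cancellation in 1 - erf(z) for large z. The name n_labels and the constant names are hypothetical.

```python
import math

BIN_WIDTH = 0.354606042   # histogram bin width in log10(size), from the worksheet
PREFACTOR = 233.3297812   # prefactor of the erf term, from the worksheet
SLOPE = 0.8226361874      # coefficient of xt inside erf, from the worksheet
OFFSET = 1.431685327      # constant inside erf; SLOPE*log10(55) to the digits given

def n_labels(xt):
    """Number of locations with log10(size) >= xt, from the fitted distribution."""
    z = SLOPE * xt - OFFSET
    # PREFACTOR*(1 - erf(z)) written with erfc.
    return PREFACTOR * math.erfc(z) / BIN_WIDTH

print(n_labels(float("-inf")))   # total from the fit, roughly 1316
print(n_labels(math.log10(55)))  # about half of the total
```

The total of roughly 1316 compares with the 1327 locations in the data, and half of the fitted total lies above s0 = 55 km, as expected for a distribution symmetric around log10(s0).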
Plot the total number of visible labels vs. xt=log10(size):
| > | plot(nLabels(x),x=-1..4,OPTS); |
Next we calculate the Importance weights, as outlined above, from nLabels = (c/d_vis(I))^2:
| > | Importance:=expand(solve(nLabels=(c/(14.82791092+86.91039073*imp))^2,imp)[1]); |
c is the constant to be determined, e.g. from the requirement of seeing 20 labels (10/hemisphere) at a distance of 40000 km:
| > | evalf(solve(20=(c/40000)^2,c)); |
For general c, we get:
| > | solve(nLabs=(C/40000)^2,C)[1]; |
| > | Imp:=evalf(subs(nLabels=nLabels(log10(s)),c=40000*nLabs^(1/2),-.1706114861+.1150610407e-1/nLabels^(1/2)*c)); |
| > | solve(nLabs=(C/40000)^2,C); |
| > | solve(log10(s)=4,s); |
| > | w1:=loglogplot(subs(nLabs=10,Imp),s=0.1..10000,axes=boxed,labels=["Size [km]","Importance Weight"],color=red,OPTS,numpoints=5000): |
| > | w2:=loglogplot(subs(nLabs=20,Imp),s=0.1..10000,axes=boxed,labels=["Size [km]","Importance Weight"],color=blue,numpoints=5000,OPTS): |
| > | w3:=loglogplot(subs(nLabs=5,Imp),s=0.1..10000,axes=boxed,labels=["Size [km]","Importance Weight"],color=green,numpoints=5000,OPTS): |
| > | display({w1,w2,w3}); |
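For illustration, the resulting weight function can be sketched in Python (the worksheet itself hands the formula to a Perl script, which is not shown here). The constants are copied from the worksheet; the function names, the default arguments and the guard for a vanishing label count are assumptions of this sketch.

```python
import math

A_KM = 14.82791092        # intercept of d_vis = A_KM + B_KM*I [km]
B_KM = 86.91039073        # slope of d_vis = A_KM + B_KM*I [km]
BIN_WIDTH = 0.354606042
PREFACTOR = 233.3297812
SLOPE = 0.8226361874
OFFSET = 1.431685327

def n_labels(xt):
    """Number of locations with log10(size) >= xt (fitted distribution)."""
    return PREFACTOR * math.erfc(SLOPE * xt - OFFSET) / BIN_WIDTH

def importance(size_km, n_labs=20.0, ref_distance_km=40000.0):
    """Importance weight for a location of the given size [km].

    n_labs labels are required to be visible at ref_distance_km,
    which fixes c = ref_distance_km*sqrt(n_labs).
    """
    n = n_labels(math.log10(size_km))
    if n <= 0.0:
        # erfc underflows for extremely large sizes; treat as always visible.
        return math.inf
    c = ref_distance_km * math.sqrt(n_labs)
    return (c / math.sqrt(n) - A_KM) / B_KM

for s in (1.0, 55.0, 1000.0):
    print(s, importance(s))
```

Solving d_vis(I) = c/sqrt(nLabels) for I in this way is the same expression as Imp above: dividing through by B_KM gives the coefficients -0.1706114861 and 0.01150610407.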
| > |
-------------------------------------------------------------------------------------
This solves the problem: the above function is entered into my Perl script,
which assigns the Importance weights accordingly!
--------------------------------------------------------------------------------------