file: Mars.mws

===========

7.3.2004

------------------------------------------------------------------------

This worksheet derives the importance weights as a function of

the location size, such that the label density on the screen

remains constant at all distances.

Typical applications are the locations on Mars and Venus.

-------------------------------------------------------------------------

>    with(plots):

>    with(stats): with(stats[statplots]):

Warning, these names have been redefined: anova, describe, fit, importdata, random, statevalf, statplots, transform

Warning, these names have been redefined: boxplot, histogram, scatterplot, xscale, xshift, xyexchange, xzexchange, yscale, yshift, yzexchange, zscale, zshift

The file 'size.txt' contains a one-dimensional array of the (non-vanishing) sizes [km] of the 1327 Mars locations. Read it into Maple:

>    data:=readdata("size.txt",float):

Calculate log base 10 for each element:

>    size:=evalf(map(log10,data)):

Histogram of the log10(size) distribution:

>    ph:=histogram(size,area=count,axes=boxed,labels=["log10(size)","number of labels"],labeldirections=[horizontal,vertical]):

>    display(ph);

[Maple Plot]

Extract the numerical values from histogram plot structure 'ph':

>    dbx:=[];dby:=[];dbxy:=[];

dbx := []

dby := []

dbxy := []

>    for i from 1 to 12 do

>    dbx:=[op(dbx),op([1,i],ph)[2][1]];

>    dby:=[op(dby),op([1,i],ph)[2][2]];

>    dbxy:=[op(dbxy),op([1,i,2],ph)];

>    end do:

>    dbxy;

[[-.5228787453, 7.000000001], [-.1682727032, 19.00000000], [.1863333389, 67.00000005], [.5409393807, 105.0000000], [.8955454227, 214.9999999], [1.250151465, 174.9999996], [1.604757508, 170.0000001], [1...

>    dbx;

[-.5228787453, -.1682727032, .1863333389, .5409393807, .8955454227, 1.250151465, 1.604757508, 1.959363550, 2.313969592, 2.668575634, 3.023181676, 3.377787718]

>    dby;

[7.000000001, 19.00000000, 67.00000005, 105.0000000, 214.9999999, 174.9999996, 170.0000001, 238.0000001, 139.0000000, 119.0000000, 53.00000001, 20.00000001]

>    3.377787718-3.023181676;

.354606042

Compute the number of locations in bins j..12; for j=1 this is the total over all 12 bins:

>    GG:=j->sum(dby['i'],'i'=j..12);

GG := proc (j) options operator, arrow; sum(dby['i'],('i') = j .. 12) end proc

>    for j from 1 to 12 do

>    GG(j);

>    od:

>    GG(1);

1327.000000

Aha, they sum up to the total count of locations in the database.
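As a cross-check outside Maple, the tail sums GG(j) and the bin width can be recomputed from the extracted lists. This is a minimal Python sketch, not part of the worksheet: Python itself and the helper name `tail_sums` are assumptions, and the numbers are the `dbx` and `dby` values printed above, with the counts rounded to integers.

```python
# Bin x-values (dbx) and counts (dby) copied from the Maple output above;
# the counts are rounded to the nearest integer.
dbx = [-0.5228787453, -0.1682727032, 0.1863333389, 0.5409393807,
       0.8955454227, 1.250151465, 1.604757508, 1.959363550,
       2.313969592, 2.668575634, 3.023181676, 3.377787718]
dby = [7, 19, 67, 105, 215, 175, 170, 238, 139, 119, 53, 20]


def tail_sums(counts):
    """Return [GG(1), ..., GG(n)], where GG(j) sums the counts of bins j..n."""
    sums = []
    running = 0
    for c in reversed(counts):
        running += c
        sums.append(running)
    return list(reversed(sums))


gg = tail_sums(dby)

# Spacing between consecutive bin x-values, i.e. the bin width.
widths = [b - a for a, b in zip(dbx, dbx[1:])]

print(gg[0])                     # total over all 12 bins
print(min(widths), max(widths))  # should agree up to rounding
```

The first entry of `gg` corresponds to GG(1) in the worksheet, and the spacing reproduces the difference 3.377787718-3.023181676 evaluated above.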

Test against a normal distribution of log10(size) around log10(s0),

with the best s0 found by trial and error:

>    assume(v>0);

>    dnLabelsdx:=sqrt(v/Pi)*nLabels_tot*exp(-v*(x-log10(s0))^2);

dnLabelsdx := (v/Pi)^(1/2)*nLabels_tot*exp(-v*(x-ln(s0)/ln(10))^2)

>    int(dnLabelsdx,x=-infinity..infinity);

nLabels_tot

OK, the distribution correctly  integrates to the total number

of Mars locations = 1327 with non-vanishing size
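The normalization can also be checked numerically. This is a small Python sketch under stated assumptions: the function names `dn_labels_dx` and `integrate` are hypothetical, and the parameter values (v = 0.7, s0 = 55, total = 1327) are illustrative choices for the check only, not fitted results.

```python
import math


def dn_labels_dx(x, v, n_tot, s0):
    """sqrt(v/pi) * n_tot * exp(-v*(x - log10(s0))^2), with x = log10(size)."""
    return math.sqrt(v / math.pi) * n_tot * math.exp(-v * (x - math.log10(s0)) ** 2)


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


# Illustrative values only (assumptions made for this check).
v, n_tot, s0 = 0.7, 1327.0, 55.0
mu = math.log10(s0)

# The integrand is negligible beyond +/- 15 around the centre.
area = integrate(lambda x: dn_labels_dx(x, v, n_tot, s0), mu - 15.0, mu + 15.0)
print(area)  # should be close to n_tot
```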

By trial and error, S0 in km:

>    S0:=55;

S0 := 55

Instead of x = log10(size), take (x-log10(s0))^2 as the independent variable

and ln(dnLabelsdx) as the dependent variable:

>    XL:=evalf(subs(s0=S0,map(x->(x-log10(s0))^2,dbx)));

XL := [5.122261789, 3.642889060, 2.415007221, 1.438616272, .7137162134, .2403070441, .1838876511e-1, .4796137712e-1, .3290248792, .8615792713, 1.645624553, 2.681160726]

Form the ln of the y-values (counts) to make the fit function linear in the parameters:

>    YL:=map(x->ln(x),dby);

YL := [1.945910149, 2.944438979, 4.204692620, 4.653960350, 5.370638028, 5.164785972, 5.135798438, 5.472270674, 4.934473933, 4.779123493, 3.970291914, 2.995732274]

>    Y:=y=collect(combine(simplify(subs((x-ln(s0)/ln(10))^2=x,ln(dnLabelsdx)),symbolic),symbolic),[x],factor);

Y := y = -v*x+ln(1/Pi^(1/2)*v^(1/2)*nLabels_tot)

Do a least-squares fit to the logarithm of the normal distribution:

>    w:=fit[leastsquare[ [x,y],y=a-b*x] ]([XL,YL]);

w := y = 5.377993828-.6767302970*x

>    cc:=solve({coeff(rhs(w-Y),x,0),coeff(rhs(w-Y),x,1)},{v,nLabels_tot});

cc := {v = .6767302970, nLabels_tot = 466.6595624}
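The fit and the parameter extraction can be reproduced with ordinary least squares. This is a minimal Python sketch, assuming that Maple's `fit[leastsquare]` performs an unweighted linear regression here; the helper name `linear_fit` is hypothetical, and the data are the XL and YL lists printed above.

```python
import math

# (x - log10(55))^2 values (XL) and ln(count) values (YL) from the worksheet.
XL = [5.122261789, 3.642889060, 2.415007221, 1.438616272, 0.7137162134,
      0.2403070441, 0.01838876511, 0.04796137712, 0.3290248792,
      0.8615792713, 1.645624553, 2.681160726]
YL = [1.945910149, 2.944438979, 4.204692620, 4.653960350, 5.370638028,
      5.164785972, 5.135798438, 5.472270674, 4.934473933, 4.779123493,
      3.970291914, 2.995732274]


def linear_fit(xs, ys):
    """Unweighted least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


intercept, slope = linear_fit(XL, YL)

# Compare with y = -v*x + ln(sqrt(v/pi) * nLabels_tot):
v = -slope
n_labels_tot = math.exp(intercept) * math.sqrt(math.pi / v)
print(intercept, v, n_labels_tot)
```

Matching the slope gives v directly, and the intercept then fixes nLabels_tot, which is what the `solve` call above does symbolically.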

>    pt:=plot(subs(x=(x-log10(S0))^2,cc[1],cc[2],exp(rhs(Y))),x=-1..4.5,color=red,thickness=2):

>    display({ph,pt},labels=["log10( size )","Number of Labels"],labeldirections=[horizontal,vertical], title="Normal Distribution of log10(size) around log10(s0=55)", titlefont=[HELVETICA,20],OPTS);

[Maple Plot]

OK, not perfect, but approximately compatible with a normal distribution...

-------------------------------------------------------------------------------------

Next, we want to derive the importance weights I

such that the label density on the monitor always remains constant!

-------------------------------------------------------------------------------------

Strategy:

=======

i) Let  nLabels = (number of visible labels) at distance d of our object (Mars, Venus,...),

   having an  area A(d) = (const/d)^2 on screen in [pix^2].

 

   =============================================

   Require that the visible label density is about constant

   at all distances d [FoV's] of our object, i.e  

   

    nLabels/A(d) = const*nLabels*d^2  = constant

   =============================================

 

ii) For the given monitor resolution, and a range of 'importance weights I',

    determine empirically the distances d = d_vis(I) of our object, for which

    the associated labels just become visible.

    It is a linear relation  as expected (see below).

   

    d_vis = 14.8 +86.9*I  [km]

  

   Thus the requirement of a constant label density turns into a formula

   for the importance weights I

  

    I = const/sqrt(nLabels)-14.8/86.9

iii) On Earth I calculated nLabels = nLabels(population) from

    the known data on city populations. For Mars, Venus,...

    we may as well take the number-distribution of the

    location sizes.

   Above we obtained approximately a Normal distribution

   nLabels = Normal(log10(size))

   around s0 = 55.

iv) We may feed this in and determine the only unknown constant

     by requiring a convenient number of visible labels at a certain

     distance of the object. E.g.  for Earth, 10/hemisphere   at a distance

     of 40000 km.

-------------------------------------------------------------------------

 

Our problem of expressing the weights as a function of the known location sizes,

so as to keep the label density on the screen constant, is solved!

Let's get quantitative:

>    distance:=[187,325,520,999,2085,6107,9835,15708,24880,33900];

distance := [187, 325, 520, 999, 2085, 6107, 9835, 15708, 24880, 33900]

>    importance:=[2.2,3.84,6.11,11.49,24.08,70.13,112.72,178.9,285.68,391];

importance := [2.2, 3.84, 6.11, 11.49, 24.08, 70.13, 112.72, 178.9, 285.68, 391]

>    distimp:=[[2.2,187],[3.84,325],[6.11,520],[11.49,999],[24.08,2085],[70.13,6107],[112.72,9835],[178.9,15708],[285.68,24880],[391,33900]];

distimp := [[2.2, 187], [3.84, 325], [6.11, 520], [11.49, 999], [24.08, 2085], [70.13, 6107], [112.72, 9835], [178.9, 15708], [285.68, 24880], [391, 33900]]

>    q0:=pointplot(distimp,symbol=BOX,color=blue,symbolsize=20):

Again, a least-squares fit of the linear relation between min. distance and importance weight:

>    fit[leastsquare[[x,y],y=a+b*x]]([importance,distance]);

y = 14.82791092+86.91039073*x

>    q1:=plot(14.82791092+86.91039073*imp,imp=1..1000,color=red,thickness=2):

>    display({q0,q1},axes=boxed,labels=["Importance weight","min. distance [ km ], where visibility starts"], labeldirections=[horizontal,vertical]);

[Maple Plot]

Aha, an excellent fit!
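The quality of this linear fit can be quantified. The following Python sketch reruns the regression on the `importance` and `distance` lists above and adds a coefficient of determination; the helper name `linear_fit` and the use of R^2 as a quality measure are assumptions and do not appear in the worksheet.

```python
# Empirical data from the worksheet: importance weight vs. the distance [km]
# at which the corresponding labels just become visible.
importance = [2.2, 3.84, 6.11, 11.49, 24.08, 70.13, 112.72, 178.9, 285.68, 391]
distance = [187, 325, 520, 999, 2085, 6107, 9835, 15708, 24880, 33900]


def linear_fit(xs, ys):
    """Unweighted least squares for y = intercept + slope * x, plus R^2."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return intercept, slope, 1.0 - ss_res / ss_tot


a, b, r2 = linear_fit(importance, distance)
print(a, b, r2)  # d_vis = a + b * importance, with r2 near 1 for a good fit
```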

 --------------------------

Next, since we want the total number of visible labels for a given log10(size) = xt,

we must integrate from xt to infinity (all labels corresponding to a size larger than xt are also visible!):

>    Int(subs(x=(x-log10(S0))^2,cc[1],cc[2],exp(rhs(Y))),x=xt..infinity)=int(subs(x=(x-log10(S0))^2,cc[1],cc[2],exp(rhs(Y))),x=xt..infinity);

Int(exp(-.6767302970*(x-ln(55)/ln(10))^2+ln(383.8910433/Pi^(1/2))),x = xt .. infinity) = -233.3297812*erf(.8226361874*xt-1.431685327)+233.3297812

Define a function from the result and divide by the bin width (0.354606042, computed above), since the fit was made to counts per bin:

>    nLabels:=xt->evalf(1/.354606042*(-233.3297812*erf(.8226361874*xt-1.431685327)+233.3297812));

nLabels := proc (xt) options operator, arrow; evalf(1/.354606042*(-233.3297812*erf(.8226361874*xt-1.431685327)+233.3297812)) end proc

Let's see what the total number of labels becomes. Is it close to 1327?

>    nLabels(-infinity);

1315.994391

YES, indeed, it's not at all bad!
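The cumulative count can be evaluated outside Maple as well. This is a Python sketch of the same erf expression, built from the fitted v and nLabels_tot, S0 = 55 and the bin width quoted above; the function name `n_labels` and the constant names are hypothetical.

```python
import math

# Fit results and constants taken from the worksheet output above.
V = 0.6767302970         # fitted parameter v
N_FIT = 466.6595624      # fitted nLabels_tot (fit was made to counts per bin)
S0 = 55.0                # centre of the distribution [km]
BIN_WIDTH = 0.354606042  # histogram bin width in log10(size)


def n_labels(xt):
    """Estimated number of locations with log10(size) >= xt."""
    z = math.sqrt(V) * (xt - math.log10(S0))
    return 0.5 * N_FIT * (1.0 - math.erf(z)) / BIN_WIDTH


print(n_labels(float("-inf")))   # all locations
print(n_labels(math.log10(S0)))  # half of them, by symmetry of the Gaussian
```

The first value corresponds to `nLabels(-infinity)` above; the second is half of it because the distribution is symmetric around log10(S0).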

Plot the cumulative number of visible labels vs. xt = log10(size):

>    plot(nLabels(x),x=-1..4,OPTS);

[Maple Plot]

Next we calculate the Importance weights, as outlined above, from

>    Importance:=expand(solve(nLabels=(c/(14.82791092+86.91039073*imp))^2,imp)[1]);

Importance := -.1706114861+.1150610407e-1/nLabels^(1/2)*c

c is the constant to be determined, e.g. from the requirement of seeing 20 labels (10/hemisphere) at a distance of 40000 km:

>    evalf(solve(20=(c/40000)^2,c));

-178885.4382, 178885.4382

For general c, we get:

>    solve(nLabs=(C/40000)^2,C)[1];

40000*nLabs^(1/2)

>    Imp:=evalf(subs(nLabels=nLabels(log10(s)),c=40000*nLabs^(1/2),-.1706114861+.1150610407e-1/nLabels^(1/2)*c));

Imp := -.1706114861+460.2441628/(657.9971957*erf(-.3572663568*ln(s)+1.431685327)+657.9971957)^(1/2)*nLabs^(1/2)

>    solve(nLabs=(C/40000)^2,C);

40000*nLabs^(1/2), -40000*nLabs^(1/2)

>    solve(log10(s)=4,s);

10000

>    w1:=loglogplot(subs(nLabs=10,Imp),s=0.1..10000,axes=boxed,labels=["Size [km]","Importance Weight"],color=red,OPTS,numpoints=5000):

>    w2:=loglogplot(subs(nLabs=20,Imp),s=0.1..10000,axes=boxed,labels=["Size [km]","Importance Weight"],color=blue,numpoints=5000,OPTS):

>    w3:=loglogplot(subs(nLabs=5,Imp),s=0.1..10000,axes=boxed,labels=["Size [km]","Importance Weight"],color=green,numpoints=5000,OPTS):

>    display({w1,w2,w3});

[Maple Plot]
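For use outside Maple, the importance weight can be written as one function of the location size. This is a Python sketch assembled from the numbers above (fit parameters, bin width, the linear relation d_vis = 14.82791092 + 86.91039073*I, and the reference distance of 40000 km); the names `n_labels`, `importance` and the constants are hypothetical, and `math.erfc(z)` is used in place of `1 - erf(z)` for better accuracy at large sizes.

```python
import math

V, N_FIT, S0, BIN_WIDTH = 0.6767302970, 466.6595624, 55.0, 0.354606042
A, B = 14.82791092, 86.91039073  # d_vis = A + B * importance [km]
D_REF = 40000.0                  # reference distance [km]


def n_labels(xt):
    """Estimated number of locations with log10(size) >= xt."""
    z = math.sqrt(V) * (xt - math.log10(S0))
    return 0.5 * N_FIT * math.erfc(z) / BIN_WIDTH  # erfc(z) = 1 - erf(z)


def importance(size_km, n_labs):
    """Importance weight for a location of the given size [km].

    n_labs is the number of labels that should be visible at D_REF,
    which fixes the constant c = D_REF * sqrt(n_labs).
    """
    c = D_REF * math.sqrt(n_labs)
    return (c / math.sqrt(n_labels(math.log10(size_km))) - A) / B


for s in (1.0, 10.0, 100.0, 1000.0):
    print(s, importance(s, 10))
```

By construction, a location of size s then becomes visible at the distance d where n_labels(log10(s)) * d^2 = n_labs * D_REF^2, which is the constant-density requirement stated in the strategy.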

-------------------------------------------------------------------------------------

This solves the problem: the above function is entered into my Perl script,

which assigns the importance weights accordingly!

--------------------------------------------------------------------------------------