Stata help

  • 20-03-2010 2:16pm
    #1
    Posts: 5,589 ✭✭✭


    Stata fiends, I need help!

    I have a dataset which has mostly null values (shown as . ) and I need to run an if statement, but I'm not sure what value to put in for the dot.

    Any help?


Comments

  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    You mean blank cells? Doesn't Stata just omit the obs?


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Yeah, but I have a column that is . . . . . . . some number . . . . . . some other number etc

    I want to create a column that will be 0 if the first column is blank, 1 if not


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    I'm slightly confused. You mean:

    I want to create a column that will be 0 if the first row is blank, 1 if not?


  • Posts: 5,589 ✭✭✭ [Deleted User]


    C1 C2
    . 0
    . 0
    . 0
    4 1
    . 0
    . 0
    . 0
    5 1
    . 0

    This is what I want to end up with. C1 exists at the moment


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    I don't think you can get Stata to recognise the null variables unless you give them a non-indeterminate value. What you are asking for kind of reminds me of a companion form matrix, maybe Stata has a similar tool.

    http://en.wikipedia.org/wiki/Companion_matrix


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Does

    gen exists = 1
    replace exists = 0 if myvar == .


    not work?


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Nope, that replaces all the entries with 1.
    Stata doesn't seem to know what to do with the blanks.


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Do you have the XLS dataset? Maybe you could ctrl+H the dots and replace them with something ("null") that Stata will recognise?


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Have you tried my code above but with =="." instead of ==. ?

    As alluded to by the mod of the Boring forum, you could change this with a bit of brute force. Enter the command edit, select and copy all the cells, paste into Excel, "Find and Replace All" in Excel, paste back into Stata. I feel dirty even suggesting it.
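
    For reference, a minimal sketch of the 0/1 indicator asked for above, done inside Stata rather than via Excel. It assumes two possibilities the thread does not settle: that the column is stored as a number (where . is Stata's numeric missing value), or that it was imported as text (where "." is just a one-character string). The names c1_text, c2 and c2_text are made up for illustration.

    ```stata
    clear
    * Toy data: c1 is numeric, c1_text is the same column imported as text
    input c1 str5 c1_text
    . "."
    . "."
    4 "4"
    . "."
    5 "5"
    end

    * Check how each column is stored before comparing against . or "."
    describe c1 c1_text

    * Numeric case: missing() is true for . and the extended codes .a to .z
    generate byte c2 = !missing(c1)

    * String case: a literal "." is text, so test for it (and for empty strings)
    generate byte c2_text = !inlist(c1_text, "", ".")

    list
    ```

    Both new columns should come out as 0 where the original cell is blank and 1 otherwise. Comparing a string column against the numeric . is a type mismatch, which is why it helps to run describe first.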


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Says the mod of the LaTeX forum.

    :p


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Dataset is massive and will be growing, automation all the way!

    I've gotten a workaround but I'll try that tomorrow. Cheers lads!


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    I'm having a little problem. I am importing a dataset into Stata, but when I go to look at the data, it is highlighting one of my columns (GDP) in red, suggesting that it is non-numerical. Indeed, when I attempt to create a log version of this variable, it tells me that it can't. I have looked up and down the column in question, and can see only numbers. No letters, no spaces.

    Any ideas?


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Something wrong in the underlying Excel file?


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    That's my only inclination, tbh. Is there any way that I can tell Stata to reclassify the column? Out of interest.


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Just remade the CSV from the original XLS file, and the same thing occurred. Is Stata known to have issues with Office 2007? I don't think that could be it, since I have used this combo before.


  • Posts: 5,589 ✭✭✭ [Deleted User]


    I use Stat/Transfer to convert it straight into a .dta file.

    Copy the column in the original file and see if the duplicate works.


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Yeah, that's what I tried. I will give that Stat/Transfer thing a go.


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine




  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving



    That's handy. I figured it out last night anyway: it turns out the commas Excel inserts as thousands separators were causing it. There is a setting to switch them off.
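
    On the earlier question of telling Stata to reclassify the column: a minimal sketch, assuming the variable is called gdp and was read in as a string only because of the thousands commas. It uses destring with its ignore() option to strip the commas and convert the column in place. The toy values and the name lngdp are made up for illustration.

    ```stata
    clear
    * Toy data: GDP figures imported as text because of the commas
    input str12 gdp
    "1,234"
    "56,789"
    "1,000,000"
    end

    * Storage type shows up as a string (str12) at this point
    describe gdp

    * Strip the commas and convert the column to numeric in place
    destring gdp, replace ignore(",")

    * The log transformation now works
    generate lngdp = ln(gdp)
    list
    ```

    This keeps the fix inside the do-file, so it can be rerun whenever the CSV is regenerated. If the column holds other stray characters, destring without ignore() will refuse to convert it, which is a useful warning in itself.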


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Ok, I have a country-level dataset and I am creating three continent dummies: one each for Asia, Africa and Other (anything not in the other two).

    So, when I regress:

    Y = Asia + Africa + X

    no problems. But when I include the 'Other':

    Y = Asia + Africa + Other + X

    Asia gets "dropped". Asia does not get dropped when:

    Y = Asia + Other + X


    Any ideas?


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Collinearity?


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Zaraba got it. You have perfect multicollinearity because you fell into the "dummy variable trap."

    Think about it: if you have a male dummy in a regression, you see the effect of being male. That implies you're working off a base of being female. Including a female dummy as well is redundant alongside the constant and will actually lead to singularity problems with your X'X matrix, i.e. perfect multicollinearity. The same thing applies to your three-continent example.
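
    A small simulated illustration of the trap, offered as a sketch rather than a reconstruction of the actual dataset. The grouping variable region, the coefficients and the sample size are all invented; only the variable names y, asia, africa, other and x follow the thread.

    ```stata
    clear
    set obs 90
    set seed 12345

    * Hypothetical grouping: every observation falls in exactly one of three groups
    generate byte region = mod(_n, 3)
    generate byte asia   = region == 0
    generate byte africa = region == 1
    generate byte other  = region == 2

    generate x = rnormal()
    generate y = 1 + 0.5*asia - 0.3*africa + 2*x + rnormal()

    * The three dummies always sum to 1, which is exactly the constant column
    assert asia + africa + other == 1

    * All three dummies plus a constant: Stata omits one collinear regressor
    regress y asia africa other x

    * Leaving one category out makes it the base group, and nothing is omitted
    regress y asia africa x
    ```

    In the second regression the coefficients on asia and africa are differences relative to the omitted group, which is the same logic as the male/female example.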


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Ok, I get that. But I am simply emulating a published paper, where they do just as I described. Any suggestions?


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Treat one of the dummies as the constant.
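
    One way to read this suggestion, sketched on invented data: keep all three dummies and drop the constant with regress's noconstant option, so each dummy coefficient acts as that group's own intercept. Whether the published paper did this is not stated in the thread, so treat it as one possibility. The variable region and all numbers are hypothetical.

    ```stata
    clear
    set obs 90
    set seed 12345

    * Hypothetical three-way grouping, as in a continent classification
    generate byte region = mod(_n, 3)
    generate byte asia   = region == 0
    generate byte africa = region == 1
    generate byte other  = region == 2

    generate x = rnormal()
    generate y = 1 + 0.5*asia - 0.3*africa + 2*x + rnormal()

    * No constant: all three dummies are kept, each one a group intercept
    regress y asia africa other x, noconstant

    * Equivalent fit with a constant and "other" as the base group
    regress y asia africa x
    ```

    The two regressions describe the same fitted model. The coefficient on other in the first equals the constant in the second, and the asia coefficient in the first equals the constant plus the asia coefficient in the second.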


  • Closed Accounts Posts: 6,609 ✭✭✭Flamed Diving


    Oh right. Good thinking.


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Lesson 1 in undergrad time series work!
    Thank you Mike Harrison.


  • Registered Users, Registered Users 2 Posts: 8,452 ✭✭✭Time Magazine


    Greatest man ever.

